1.  Introduction

A  goal  that  many  researchers  in  automated  teaching  would  like  to  achieve 
is  the  development  of  branching  procedures  or  designs  of  teaching  programs 
that  would  in  some  sense  be  best  tailored  to  the  needs  of  the  individual 
student.  This  paper  is  concerned  essentially  with  giving  theoretical 
foundations  in  terms  of  statistical  decision  theory  to  the  pursuit  of  that 
goal  which  for  the  most  part  has  enjoyed  the  status  of  a  nice  but  vague  idea. 
Several  alternative  design  problems  are  formulated  in  this  paper  and  a  general 
technique  for  solution  for  best  designs  is  outlined  and  illustrated. 

The  design  problem  for  automated  teaching  programs  or  experiments  is 
attacked  in  this  paper  from  the  standpoint  of  the  theory  of  the  sequential 
design of experiments. This seemed to be an especially interesting way to
look at the design of automated teaching experiments since the use of a
high-speed computer in conducting these experiments provides at least the possibility
of making rapid calculations about whether a teaching experiment should be
continued, and if so, what type of item should be presented at the next trial.

Part 2 of this report is devoted to outlining the general theory of the
sequential design of experiments and the use of Bayesian procedures for
determining best designs. Sequential experimentation in general is distinguished from
fixed sample-size experimentation in that explicit consideration is given in
sequential procedures to the cost of administering each subexperiment. In a
sequentially designed experiment, the strategies available to the statistician
or experimenter are made up of three components: the choice of an experiment,
the choice of a sampling plan, and the choice of a terminal decision function.
In designing an automated teaching program these three components of a strategy
may be interpreted, respectively, as: the choice of a rule for deciding what
sequence of items shall be administered based on earlier item administrations
and responses, the determination of the conditions under which the teaching
experiment should be stopped for a student based on the set of possible
outcomes of an experiment, and the determination of just what conclusion should
be reached about the student's mastery of concepts at the termination of the
program.

An effort was begun by Dear and Atkinson [11] to examine the branching
problem in automated teaching by formulating a mathematical model of a teaching
situation and then, within the framework of this model, mathematically searching
for  the  best  branching  rule  in  a  certain  sense  in  a  broad  class  of  branching 
rules.  A  rather  simple  two-concept  teaching  situation  was  considered  in  this 
study  in  order  to  allow  us  to  get  some  insight  into  the  structure  of  these 
branching  problems  without  being  overwhelmed  by  the  details  that  a  mathematical 
representation  of  many  current  automated  teaching  programs  would  necessitate. 

This two-concept teaching model was formulated in terms of the
stimulus-sampling mathematical learning theory which was originally developed by Estes
and  is  currently  being  widely  applied  and  extended  by  many  researchers.  Two 
concepts  labeled  A  and  B  were  considered  in  our  previous  study  and  sets  of 
equivalent  items  which  embodied  these  concepts  were  assumed  available  for 
presentation  at  each  trial.  This  two-concept  model  will  be  examined  further  in 
this paper since it is a sufficiently rich basis for study of the sequential
design problems which are to be considered here, and it accounts quite well
for the results of a number of interesting learning studies that have been
carried out.

Part  3  of  this  paper  is  devoted  to  a  fairly  detailed  review  of  the 
foundations  of  stimulus  sampling  learning  theory  for  the  purpose  of  introducing 
the  features  of  that  theory  which  are  needed  to  characterize  a  sequentially 
designed stimulus sampling teaching experiment and to identify certain
parameters of these models which could lead to a number of different statistical
decision problems. In Part 4, a number of alternative sequential design
problems are developed in terms of several different identifications of states
of nature or parameters of the relevant probability distributions; several
different objective functions; and several alternative ways of expressing
losses incurred by terminal decisions. The point is emphasized in Part 4 that
solutions  for  best  sequential  designs  of  teaching  experiments  have  the  difficult 
complication  over  many  current  sequential  design  problems  that  the  probability 
distributions  on  the  sample  spaces  of  these  teaching  experiments  cannot  be 
simply  broken  down  over  trials  into  independently  distributed  marginal 
components. 

The  technique  of  solution  for  best  sequential  designs  of  experiments 
called "backward induction" is outlined in Part 5. Solutions for best designs
in several miniature 3-trial teaching experiments using this technique are then
illustrated. The paper concludes, in Part 6, with a discussion of characteristics
that models of teaching processes need to have in order to be accessible to
computation for best designs in full-scale teaching programs even when the
backward induction technique is applied. The critical importance of coarse,
sufficient partitions of the sample space of the teaching models is emphasized.

2.  Outline of the Theory of the Sequential Design of Experiments

Sequential experimentation is a mode of carrying out statistical
experiments by performing a sequence of subexperiments with the possibility that
experimentation may be stopped at any point in the sequence and a terminal
decision then made at that stopping point. The theory of sequential analysis
of experiments, many of whose early developments are due to Wald [22], is
usually considered to refer to the situation where the identical subexperiment
is repeated independently at the various points in the sample sequence. The
extension of sequential theory to allow the possibility of selecting from a
set of different subexperiments at each point in the experimental sequence is
the distinguishing feature of the theory of the sequential design of experiments.

In  addition  to  Wald’s  pioneering  book  [22],  two  general  references  are 
now  available  which  give  detailed  consideration  to  various  aspects  of  the  theory 
of sequential experimentation. Blackwell and Girshick [4] present the
foundations of sequential experimentation and they consider methods of solution for
best sequential strategies in a number of specific situations; however, for
the most part, these authors do not deal with the sequential design problem as
they deliberately avoid the complication of sequential theory that the order
in which subexperiments are performed may be important. In sequential games
involving teaching or learning experiments one will, of course, expect the
order of experimentation to be important. Raiffa and Schlaifer [18] consider
a more general representation of sequential theory which does incorporate the
possibility of performing different subexperiments at the various points in an
experimental sequence. In the applications of this theory that these authors
then consider, they, too, take up only cases of independently distributed
outcomes of the subexperiments.

A  number  of  specialized  papers  on  the  sequential  design  of  experiments 
have  appeared  in  the  literature.  The  so-called  "two-armed  bandit"  problem  and 
generalizations of this problem are the subject of several papers: Robbins [19],
Bradt and Karlin [6], and Bradt, Johnson, and Karlin [7]. Chernoff [9] and
Albert [1] have considered some hypothesis-testing problems from the standpoint
of sequential design of experiments. DeGroot [12] has examined sequential
design  problems  in  terms  of  various  measures  of  information  in  an  experiment. 

The  design  problems  which  Raiffa  [17]  has  considered  in  his  study  of  item 
selection  procedures  are,  in  terms  of  the  subject  of  study,  perhaps  the  most 
similar  to  the  design  problems  in  teaching  experiments  of  any  sequential  design 
problems which have appeared. Raiffa considers both sequential and non-sequential
experiments  in  this  study  which  deals  with  the  selection  of  items  to  develop 
psychometric  tests  and  medical  diagnosis  procedures;  however,  in  this  study  too, 
the  outcomes  of  the  subexperiments  are  assumed  to  be  independently  distributed. 

An  outline  of  the  general  structure  of  sequential  statistical  games  will 
be sketched in the remaining sections of Part 2. For further detailed
information about various aspects of the theory of statistical games, one may wish
to consult, in addition to the general references which have been mentioned,
books such as the following: on game theory, Luce and Raiffa [15] and McKinsey
[16]; and concerning statistical decision theory at a more elementary level,
Chernoff and Moses [10] and Schlaifer [20].

Strategies in Statistical Games

The principal objective involved in the solution of any game is the
identification of a rule of play or strategy which is in some sense a "good" way to
play  the  game.  Statistical  game  theory  is  primarily  concerned  with  adapting 
a  special  class  of  games  called  two-person,  zero-sum  games  to  statistical 
decision  problems.  Under  this  adaptation,  nature  is  represented  as  being  one 
of  the  players  and  the  statistician  or  experimenter  is  viewed  as  the  second 
player. 

To  introduce  some  of  the  special  design  problems  that  arise  in  teaching 
experiments, a simple urn game will be considered. Let $U_1$ represent an urn
containing $m_1$ white marbles and $n_1$ black marbles and $U_2$ be a second urn
containing $m_2$ white marbles and $n_2$ black. Two players agree to play the following
game using these two urns. Player 1 knows the identity of the two urns while
the contents of the urns are not visible to player 2. Player 1 presents the
two urns to player 2 in either the order $(U_1,U_2)$ or $(U_2,U_1)$, and player 2 is
required to guess which presentation order is used (one may interpret the
first position in this pair as the "urn on the left," $U_L$, and the second
position as the "urn on the right," $U_R$). If player 2 guesses correctly, player
1 will pay him k dollars, while if player 2 guesses incorrectly he is to pay
player 1 k dollars.

Consequently,  each  play  of  the  game  results  in  an  exchange  of  k  dollars. 
These exchanges may be represented by payoff functions. For example, the
payoffs that player 1 will receive from player 2 in this urn game are given
in the matrix shown below:

                            Player 2's Choice

Player 1's Choice       $(U_1,U_2)$     $(U_2,U_1)$

$(U_1,U_2)$                $-k$             $k$

$(U_2,U_1)$                 $k$            $-k$

This matrix gives all of the values of player 1's payoff function, say
$M_1$, for each possible play of the game. This game is a zero-sum, two-person
game since the payoff to player 2 is the negative of the payoff to player 1.
Thus, letting $M_2$ be player 2's payoff function, the values of this function are
given by the following matrix:


                            Player 2's Choice

Player 1's Choice       $(U_1,U_2)$     $(U_2,U_1)$

$(U_1,U_2)$                 $k$            $-k$

$(U_2,U_1)$                $-k$             $k$

The two choices of orderings of the urns, $(U_1,U_2)$ and $(U_2,U_1)$, constitute
the pure strategies for playing this game for both player 1 and player 2. It
would not appear that there is any choice of a pure strategy for player 1 in this
game which in conjunction with some choice of a pure strategy for player 2
results in a payoff which is a good compromise for both. Frequently, it is
necessary for one or both of the players to resort to using more complicated
strategies called mixed strategies in order to obtain certain "good" ways to
play  the  game  or  "good  solutions"  of  the  game.  Mixed  strategies  are  formed 
by  defining  a  probability  distribution  over  the  set  of  pure  strategies  and 
hence  selecting  a  pure  strategy  by  first  operating  a  random  device  which 
appropriately  reflects  the  desired  selection  probabilities.  The  urn  game 
illustrated here and all of the games considered in the paper have the
characteristic that each of the two players has only a finite number of pure strategies.

A  fundamental  theorem  for  such  finite  games  establishes  that,  when  mixed 
strategies  are  introduced,  each  player  has  at  least  one  good  strategy. 

It can be shown for this urn game that if each of the two players
independently employs the mixture of selecting the configuration $(U_1,U_2)$ with
probability p = 1/2, then this mixed strategy is a good strategy for each
player. The expected payoff for either player using this strategy is then 0
dollars.
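
The zero expected payoff under the half-and-half mixture can be checked numerically. The following Python sketch is an illustration only: the stake k = 5 and the helper name `expected_payoff` are assumptions made for the example and do not appear in the original text.

```python
from fractions import Fraction

def expected_payoff(payoff, p_row, p_col):
    """Expected payoff when the row player picks row i with probability
    p_row[i] and the column player independently picks column j with
    probability p_col[j]."""
    return sum(p_row[i] * p_col[j] * payoff[i][j]
               for i in range(len(p_row))
               for j in range(len(p_col)))

k = 5  # hypothetical stake in dollars (assumption for illustration)

# Player 2's payoff matrix: rows are player 1's orderings,
# columns are player 2's guesses, both in the order (U1,U2), (U2,U1).
M2 = [[k, -k],
      [-k, k]]

half = [Fraction(1, 2), Fraction(1, 2)]

# Both players use the half-and-half mixture.
value = expected_payoff(M2, half, half)

# Against player 1's half-and-half mixture, neither pure guess by
# player 2 earns more than the mixed strategy does.
best_pure_reply = max(expected_payoff(M2, half, pure)
                      for pure in ([1, 0], [0, 1]))

print(value, best_pure_reply)
```

Both quantities come out to zero for any stake k, which is the sense in which neither player can improve on the mixture by deviating to a pure strategy.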

Consider  next  a  modification  of  this  game  in  which  player  2  is  allowed  to 
pay  one  dollar  to  player  1  and,  in  return  for  this  fee,  player  2  is  permitted 
to  take  a  random  draw  of  one  marble  from  either  of  the  two  urns  that  player  1 
presents.  If  player  2  decides  not  to  pay  the  entry  fee,  he  is  still  allowed 
to  guess  which  configuration  obtains,  as  in  the  original  game.  The  pure 
strategies  for  this  game  and  the  payoffs  to  player  2  for  each  pair  of  choices 
of pure strategies are shown in the matrix which follows:

                                                          Player 1's Strategy

Player 2's Strategy                      $(U_1,U_2)$                                  $(U_2,U_1)$

$[U_L,(W,(U_1,U_2)),(B,(U_1,U_2))]$      $k-1$                                        $-(k+1)$
$[U_L,(W,(U_1,U_2)),(B,(U_2,U_1))]$      $(k-1)p(W \mid U_1)-(k+1)p(B \mid U_1)$      $-(k+1)p(W \mid U_2)+(k-1)p(B \mid U_2)$
$[U_L,(W,(U_2,U_1)),(B,(U_1,U_2))]$      $-(k+1)p(W \mid U_1)+(k-1)p(B \mid U_1)$     $(k-1)p(W \mid U_2)-(k+1)p(B \mid U_2)$
$[U_L,(W,(U_2,U_1)),(B,(U_2,U_1))]$      $-(k+1)$                                     $k-1$
$[U_R,(W,(U_1,U_2)),(B,(U_1,U_2))]$      $k-1$                                        $-(k+1)$
$[U_R,(W,(U_1,U_2)),(B,(U_2,U_1))]$      $(k-1)p(W \mid U_2)-(k+1)p(B \mid U_2)$      $-(k+1)p(W \mid U_1)+(k-1)p(B \mid U_1)$
$[U_R,(W,(U_2,U_1)),(B,(U_1,U_2))]$      $-(k+1)p(W \mid U_2)+(k-1)p(B \mid U_2)$     $(k-1)p(W \mid U_1)-(k+1)p(B \mid U_1)$
$[U_R,(W,(U_2,U_1)),(B,(U_2,U_1))]$      $-(k+1)$                                     $k-1$
$(U_1,U_2)$                              $k$                                          $-k$
$(U_2,U_1)$                              $-k$                                         $k$

The notation for strategies involving experimentation may be illustrated by
the strategy $[U_L,(W,(U_1,U_2)),(B,(U_2,U_1))]$, which is interpreted as follows: the lefthand urn
is to be selected for a random draw; then if a white marble is drawn, claim that
the correct configuration is $(U_1,U_2)$, while if a black marble is drawn, claim
the configuration to be $(U_2,U_1)$. Since random moves have now been introduced
into  the  game,  one  sees  that  the  payoff  for  certain  of  the  pairs  of  the  pure 
strategies  for  the  two  players  must  be  expressed  as  expected  values  of  the 
payoffs  over  probability  distributions  on  the  random  moves. 
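
The expected payoffs of the modified game can be tabulated mechanically from the rules stated above. The sketch below is a hedged illustration: the function name `modified_game_payoffs`, the string labels "L", "R" and "none", and the urn contents used in the example call are all assumptions introduced here, not part of the original report.

```python
from fractions import Fraction
from itertools import product

ORDERS = [(1, 2), (2, 1)]  # (left urn, right urn) as urn indices

def modified_game_payoffs(m1, n1, m2, n2, k, fee=1):
    """Player 2's expected payoff for every pure strategy against each of
    player 1's orderings. Urn i holds m_i white and n_i black marbles."""
    p_white = {1: Fraction(m1, m1 + n1), 2: Fraction(m2, m2 + n2)}
    table = {}
    # Strategies with one draw: which side to draw from, what to claim
    # after a white marble, what to claim after a black marble.
    for side, claim_w, claim_b in product(("L", "R"), ORDERS, ORDERS):
        row = []
        for true_order in ORDERS:
            urn = true_order[0] if side == "L" else true_order[1]
            pw = p_white[urn]
            pay_w = (k - fee) if claim_w == true_order else -(k + fee)
            pay_b = (k - fee) if claim_b == true_order else -(k + fee)
            row.append(pw * pay_w + (1 - pw) * pay_b)
        table[(side, claim_w, claim_b)] = row
    # Strategies without experimentation: guess directly, no fee paid.
    for claim in ORDERS:
        table[("none", claim)] = [k if claim == t else -k for t in ORDERS]
    return table

# Hypothetical urn contents: U1 has 3 white, 1 black; U2 has 1 white, 3 black.
payoffs = modified_game_payoffs(3, 1, 1, 3, k=5)
for strategy, row in payoffs.items():
    print(strategy, row)
```

With these assumed contents, drawing from the left urn and claiming $(U_1,U_2)$ on white and $(U_2,U_1)$ on black has the same expected payoff against either ordering, which illustrates how a single draw can be worth its fee.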

This  second  simple  example  of  an  urn  game  has  served  to  introduce  the  main 
characteristics of a sequential-design-of-experiments problem. The pure
strategies for player 2 are seen to involve three important features: (1) a
decision concerning how much experimentation should be done, (2) a choice of
which experiment should be performed, and (3) the final decision concerning
which configuration of the urns player 1 has presented. In more general
sequential design problems these three components of a pure strategy for
player 2 may be identified respectively as the choice of a sampling plan, the
choice of a sequence of subexperiments, and the choice of a terminal decision
function.

This  urn  game  could  be  elaborated  by  allowing  additional  random  draws  by 
player 2. Most of the sequential design problems that have been treated
in the literature concern the situation where successive outcomes of
subexperiments are independently distributed. In these urn games, the
independence case would be effected by making the random draws with replacement.
The designs of teaching experiments which are considered in this paper,
on the other hand, are more similar to strategies for urn games involving
sampling without replacement.
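
The distinction between the two sampling schemes can be made concrete with a small calculation. This Python sketch is only an illustration under assumed urn contents (3 white, 1 black); the function names are hypothetical. It shows that the joint probability of two white draws equals the product of the marginal probabilities with replacement, but not without replacement.

```python
from fractions import Fraction

def joint_two_white(m, n, replace):
    """P(first draw white and second draw white) from an urn holding
    m white and n black marbles (at least two marbles in total)."""
    p_first = Fraction(m, m + n)
    if replace:
        p_second_given_first = p_first
    else:
        p_second_given_first = Fraction(m - 1, m + n - 1)
    return p_first * p_second_given_first

def marginal_second_white(m, n, replace):
    """P(second draw white), averaging over the colour of the first draw."""
    if replace:
        return Fraction(m, m + n)
    p_w1 = Fraction(m, m + n)
    return (p_w1 * Fraction(m - 1, m + n - 1)
            + (1 - p_w1) * Fraction(m, m + n - 1))

m, n = 3, 1  # hypothetical urn contents
product_of_marginals = Fraction(m, m + n) * marginal_second_white(m, n, False)
with_replacement = joint_two_white(m, n, replace=True)
without_replacement = joint_two_white(m, n, replace=False)
print(with_replacement, without_replacement, product_of_marginals)
```

Without replacement the marginal distribution of each draw is unchanged, yet the joint distribution no longer factors, which is the kind of dependence over trials that complicates the teaching designs.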

Normal  Form  of  a  Statistical  Game 

The  descriptions  of  the  two  urn  games  have  included  all  the  necessary 
ingredients  to  characterize  a  game  in  a  mode  called  the  normal  form  of  a 
statistical game. Let U be the set of all pure strategies for player 1,
V be the set of pure strategies for player 2, and $M_2$ be player 2's payoff
function defined on the product set $U \times V$. The normal form of this game is the
triple, say, $G = (U, V, M_2)$. Since statistical games are zero-sum, two-person
games, it is sufficient to specify either $M_2$ or $M_1$ in the triple to uniquely
define the game G.

When  mixed  strategies  are  employed  by  each  player,  the  sets  of  strategies 
U  and  V  may  be  expanded  to  include  all  possible  mixtures  of  the  elements  of 
each of these two sets. Let $\pi$ represent a mixed strategy for player 1 and
$\Pi$ be the set of all his mixed strategies. Similarly, let $\eta$ be a mixed strategy
for player 2 and $H$ be the set of all of player 2's mixed strategies. One defines
the mixed extension of the game G to be the triple, say, $\Gamma = (\Pi, H, M_2)$.

Statistical  games  are  often  usefully  represented  in  normal  form  to  study 
various  conditions  under  which  good  solutions  to  the  games  exist,  to  examine 
various  relationships  between  certain  classes  of  strategies,  and  to  examine 
other  fundamental  problems  in  statistical  game  theory.  Frequently,  another 
equivalent representation of a statistical game called the extensive form is
more suitable for the purpose of actually finding specific solutions to
statistical games. The characterization of the extensive form of a game will
be  deferred  to  Part  5  of  this  report  where  it  is  used  to  obtain  best  designs 
for  several  illustrative  teaching  experiments. 

Sample  Space  of  a  Statistical  Game 

The  features  of  a  sequential-design-of-experiments  problem  which  were 
introduced  in  a  rough,  intuitive  way  through  the  description  of  the  two  simple 
urn  games  will  now  be  formalized  in  order  to  allow  a  general  representation 
of sequential-design-of-experiments problems. Since statistical games
typically will involve the use of experiments by the statistician, it is desirable
to have a representation of all the possible outcomes of a statistical
experiment. Conventionally, all possible outcomes of a statistical experiment are
represented as a set, say Y, which is called the outcome space (or frequently
the sample space) of the experiment. The outcome space will be defined in
sequential design problems to be rich enough to include all possible
experiments of interest and all conceivable outcomes of each experiment. Although
the phrase "sample space of a statistical experiment" is often used synonymously
with "outcome space," it is also used to represent a triple, say, $Z = (Y, \Omega, p(\cdot \mid \omega, e))$.
The components of this triple are the outcome space Y; a set $\Omega$ of parameters or
indices of probability distributions which are defined on the outcomes of a
particular experiment $e \subset Y$; and a probability distribution on the outcomes of
a statistical experiment, $p(\cdot \mid \omega, e)$, which is defined when a parameter point
$\omega \in \Omega$ and an experiment $e \subset Y$ are specified.

Although  this  representation  of  the  sample  space  will  be  suitable  for  the 
development of most sequential design problems, a number of alternative modes
of representation of the sample space could be used. For example, one could
let $Y_e$ be the restriction of Y, or the subset of Y consisting of all outcomes
of the experiment e. One can then define the conditional sample space given
the experiment e to be the triple $Z_e = (Y_e, \Omega, p(\cdot \mid \omega))$. These conditional sample
spaces given a particular experiment e are the sample spaces considered in a
typical sequential analysis problem where it is necessary only to determine a
good sampling plan and a good terminal decision function. The important point
to note about either representation of the sample space, Z or $Z_e$, is that it
is necessary to specify both a parameter point $\omega$ and an experiment e to define a
probability distribution on the outcome space. In situations where it is well
understood that a particular experiment is being employed, it is conventional
to delete the subscript e from the definition of the conditional sample space.

In  order  to  simplify  the  description  of  a  sequential  game  and  the  set  of 
possible  experiments  that  a  statistician  could  choose  from  in  this  game, 
attention  will  be  restricted  to  games  which  will  continue  for,  at  most,  n  steps 
or  trials.  Such  sequential  games  are  called  truncated  sequential  games. 

The outcome space Y of a truncated teaching game will be a set of
n-dimensional sequences whose order is determined by the trial numbers. A
notation which will be used generally in this paper to represent sequences
and vectors is the employment of underlined lower case letters. Thus, for
example, a representative element of the outcome space Y will be indicated
as $\underline{y} = (y_1, y_2, \ldots, y_n)$. The value $y_j$ represents the coordinate or component
of $\underline{y}$ at trial j. The parameter spaces $\Omega$ which will be considered will
typically be multi-dimensional sets; consequently the elements of these sets, or
parameter points, will be similarly indicated, i.e., as $\underline{\omega}$.

The same type of notation will be used to designate experiments,
components of experiments or subexperiments, and the set of all experiments of
interest. Thus E shall be the set of all experiments of interest and the
elements of E will be denoted by e. The description of an experiment in
terms of its component subexperiments requires a somewhat more elaborate
notational apparatus. One of the most convenient ways to describe an
experiment is through the geometrical concept of a particular type of connected
graph called a tree. For example, consider a sequential game truncated at
2 trials developed in terms of Bernoulli or binomial subexperiments. In a
situation involving 2 different binomial subexperiments $e_1$ and $e_2$ (in the urn
games considered earlier $e_1$ could be the selection of the lefthand urn, $U_L$,
and $e_2$ the selection of the righthand urn, $U_R$), experiments such as the two
shown in Figure 1 are possible.


[Figure 1. Two experiments drawn as trees. Tree levels are labeled
"Subexperiment at Trial 1," "Outcome at Trial 1," and "Subexperiment at Trial 2."]

The first experiment may be characterized by the rule that the same subexperiment is
to be used at both trials; hence the rule does not depend on any outcomes.
The second experiment, on the other hand, does represent a rule which makes use of
the information about outcomes at the first trial of the experiment.
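
An experiment of this kind can be viewed as a rule that maps the outcomes observed so far to the next subexperiment. The Python sketch below is a hypothetical rendering of that idea for a game truncated at 2 trials: the labels "e1" and "e2", the outcomes "W" and "B", and the particular adaptive rule are assumptions chosen for illustration, since the trees of Figure 1 are not reproduced here.

```python
from itertools import product

OUTCOMES = ("W", "B")  # assumed binary outcomes of each subexperiment

def fixed_experiment(history):
    """Use subexperiment e1 at every trial, whatever has been observed."""
    return "e1"

def adaptive_experiment(history):
    """Start with e1; afterwards repeat e1 following a white outcome and
    switch to e2 following a black outcome (an illustrative rule only)."""
    if not history:
        return "e1"
    return "e1" if history[-1] == "W" else "e2"

def branches(experiment, n_trials=2):
    """Enumerate every root-to-leaf path of the experiment's tree as a
    pair (subexperiments used, outcomes observed)."""
    result = []
    for outcomes in product(OUTCOMES, repeat=n_trials):
        subexperiments = tuple(experiment(outcomes[:t])
                               for t in range(n_trials))
        result.append((subexperiments, outcomes))
    return result

for name, rule in (("fixed", fixed_experiment),
                   ("adaptive", adaptive_experiment)):
    print(name, branches(rule))
```

Writing the experiment as a function of the outcome history keeps the tree implicit, so the same enumeration works for any truncation point n.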

Sampling  Plans  and  Terminal  Decision  Functions 

A sequential statistical game is distinguished from a fixed sample size
or non-sequential game by the fact that sampling may terminate at any step
(perhaps without even starting experimentation). The rules which specify
when to continue and when to terminate sampling are usually called sampling
plans. Each sampling plan may be represented as a partition of the outcome
space $Y_e$ into subsets called stopping regions.

The stopping regions which comprise a sampling plan are required to be
cylinder sets of a particular type. Letting b represent a sampling plan,
B the set of all sampling plans, and $b_i$ the stopping region for the ith step
in a sequential game, it shall be required that if $\underline{y}_j$ and $\underline{y}_k$ are elements of
$Y_e$ and the values of the first i coordinates of $\underline{y}_j$ are equal to the
corresponding values of $\underline{y}_k$ (one says in this case that $\underline{y}_j$ agrees with $\underline{y}_k$ in the first
i coordinates), then $\underline{y}_j \in b_i$ if and only if $\underline{y}_k \in b_i$. The index i may range over
the set of integers 0, 1, 2, ..., n; if i = 0, one uses the definition that all
$\underline{y} \in Y_e$ agree in the value of their 0th coordinate. Consequently, $b_0$ is either
the entire set $Y_e$ or the null set. A sampling plan b is then a sequence of
cylinder sets $b = (b_0, b_1, \ldots, b_n)$ which partitions the outcome space $Y_e$;
thus, given a sampling plan, one can tell for each $\underline{y} \in Y_e$ the stopping region
of which $\underline{y}$ is an element.

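
The two requirements on a sampling plan, that the stopping regions partition the outcome space and that each region is a cylinder set on its first i coordinates, can be checked directly for a small finite game. The following Python sketch assumes outcome sequences are tuples of "W" and "B"; the function name `is_sampling_plan` and the example plans are hypothetical.

```python
from itertools import product

def is_sampling_plan(regions, outcome_space):
    """Check that regions = [b_0, b_1, ..., b_n] is a valid sampling plan.

    Partition: every outcome sequence lies in exactly one stopping region.
    Cylinder:  if some sequence in b_i has a given set of first i
               coordinates, every sequence sharing them is also in b_i.
    """
    for y in outcome_space:
        if sum(y in b for b in regions) != 1:
            return False
    for i, b in enumerate(regions):
        prefixes = {y[:i] for y in b}
        for y in outcome_space:
            if y[:i] in prefixes and y not in b:
                return False
    return True

# Outcome space for a game truncated at n = 2 trials.
Y = list(product("WB", repeat=2))

# Stop after trial 1 if a white outcome is seen, otherwise go to trial 2.
stop_on_white = [set(), {("W", "W"), ("W", "B")}, {("B", "W"), ("B", "B")}]

# Invalid: stopping at trial 1 would depend on the unseen second outcome.
peeks_ahead = [set(), {("W", "W")}, {("W", "B"), ("B", "W"), ("B", "B")}]

print(is_sampling_plan(stop_on_white, Y), is_sampling_plan(peeks_ahead, Y))
```

The second plan fails because a decision to stop at step 1 may use only the first coordinate, which is exactly what the cylinder-set requirement enforces.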
In every statistical decision problem, the ultimate goal is to make a
choice of one among a set of alternative actions. In sequential games, this
set, say A, is referred to as the set of terminal actions. The choice of a
good experiment and a good sampling plan is made only to improve one's
basis for choosing a terminal action. Part of the statistician's strategy in
a sequential game is typically formulated as the choice of a terminal decision
function. For a sequential game truncated at n steps, one may define the set
$I_n = \{i : i = 0, 1, 2, \ldots, n\}$ as the set of possible stopping points. A
terminal decision function is then defined as a function d whose domain is the
product set $I_n \times Y_e$ and whose range is the set of terminal actions A.

One also constrains the terminal decision functions by a requirement
that if two sample sequences $\underline{y}_j$ and $\underline{y}_k$ are elements of a stopping region $b_i$
and these two sample sequences agree in the values of their first i coordinates,
then the terminal decision reached for the sequence $\underline{y}_j$ must be the same as
the decision reached for $\underline{y}_k$; that is, $d(i, \underline{y}_j) = d(i, \underline{y}_k) = a$.

Cost  Functions  and  Loss  Functions 

Explicit  recognition  is  given  in  sequential  game  theory  to  the  fact  that 
each  additional  subexperiment  which  is  performed  must  in  some  sense  be  paid 
for.  The  cost  of  experimentation  in  truncated  sequential  games  is  represented 
by  a  non-negative,  bounded  function,  say  c,  whose  domain  again  is  the  product 
set $I_n \times Y_e$. A restriction similar to that imposed on the sampling plans and
the terminal decision functions is levied on the cost functions. Thus, if
$\underline{y}_j$ and $\underline{y}_k$ are each elements of a stopping region $b_i$ and $\underline{y}_j$ and $\underline{y}_k$ agree in
the values of their first i coordinates, then it shall be required that
$c(i, \underline{y}_j) = c(i, \underline{y}_k)$.

When  the  statistician  finally  stops  sampling  and  decides  upon  a  terminal 
action  a,  then  the  return  to  him  for  making  that  final  choice  will  also  be 
dependent  on  the  values  of  the  parameter  point  that  nature  has  chosen.  It  is 
customary  in  statistical  games  to  express  the  consequences  of  the  statistician's 
terminal  actions  against  each  of  the  possible  choices  of  a  parameter  point  or 
probability  distribution  as  loss  functions.  A  loss  function  in  a  statistical 
game, say L, is defined to be a non-negative, bounded function whose domain is
the product set $\Omega \times A$.

In  general  game  theory,  the  consequences  of  a  player's  choices  are  usually 
expressed  in  terms  of  the  values  of  his  utility  functions.  In  statistical 
games,  the  player's  loss  functions  are  defined  to  have  as  values,  the  negative 
of  the  values  of  his  utility  functions. 

Risk  Function 

In statistical games, the payoff to player 2 (the statistician or
experimenter) is expressed in terms of the value of a function called the risk
function. For sequentially designed experiments, the risk function, say $\rho$,
when only pure strategies are used by both players is defined as follows:

$$\rho(\omega, (e, b, d)) = \sum_{i=0}^{n} \sum_{\underline{y} \in b_i} \left[ c(i, \underline{y}) + L(\omega, d(i, \underline{y})) \right] p(\underline{y} \mid \omega, e)$$

Representation of a Sequentially Designed Statistical Game

One sees that the arguments of the payoff function $\rho$ for the statistician
in a sequentially designed game are the pairs of strategies, $\omega$ for nature and
(e, b, d) for the statistician. Thus, formally, these statistical games can be
described as the triple $G = (\Omega, E \times B \times D, \rho)$, where the parameter set $\Omega$ represents
the set of pure strategies for nature; the set of pure strategies for the
statistician is the product set $E \times B \times D$, whose components are the set of all
experiments E, the set of all sampling plans B, and the set of all terminal
decision functions D; and the risk function $\rho$ is the statistician's payoff
function.

Bayes  Principle  and  Bayes  Risk 

One  of  the  principles  that  has  enjoyed  considerable  acceptance  among 
statisticians  as  a  means  for  determining  a  preference  ordering  on  the  set  of 
strategies  available  to  the  statistician  is  Bayes  principle.  This  principle 
asserts  that  the  experimenter  or  statistician  can  designate  a  particular 
probability distribution that nature is using over the set of parameter points
Ω, or pure strategies for nature, on the basis of his previous experience and
background information available to him prior to the performance of any
experiments. An alternative payoff function applies for the statistician
when the Bayes principle is employed. This payoff function is called the
risk function against the probability distribution, say π, over the parameter
set Ω. The risk function evaluated for the strategy (e,b,d) against π in a
sequential design problem is defined as follows:

\[
\rho(\pi,(e,b,d)) = \sum_{\omega \in \Omega} \sum_{i=0}^{n} \sum_{y \in b_i}
  \bigl[ c(i,y) + L(\omega, d(y)) \bigr] \, p\{y \mid \omega, e\} \, \pi(\omega)
\]



A Bayes solution to the design problem against π is a strategy which minimizes
the risk function ρ(π,(e,b,d)). The value of the risk function for a strategy
which represents such a Bayes solution is called the Bayes risk. (However,
frequently in the literature one will see any of the risk functions against
a distribution π referred to as Bayes risk functions.) The probability
distributions π are called a priori or prior distributions on Ω.

An  important  computational  feature  of  Bayes  procedure  for  finite 
statistical  games  of  the  sort  which  characterize  the  teaching  experiments 
considered  in  the  remainder  of  this  paper  is  that  it  is  sufficient  to  consider 
only pure strategies for the statistician. It is easy to show that for these
kinds of problems some pure strategy is at least as good as any mixture
of pure strategies for every prior distribution π. However, even though the
statistician has only a finite number of the pure strategies (e,b,d), the
total number of these strategies rapidly becomes so large with increasing
truncation trial number n that solution for best strategies by sheer enumeration
of the values of ρ(π,(e,b,d)) is not feasible.
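
As a small numerical illustration of the Bayes principle described above, the
following sketch computes the risk against a prior for a few pure strategies,
picks the minimizer, and compares it with a mixture. The strategy labels,
parameter points, risk values, and prior are hypothetical numbers chosen only
for illustration; they are not taken from the teaching model. The sketch also
assumes that the risk against π is the π-weighted average of the risks at the
individual parameter points.

```python
# Hypothetical risk table: RISK[strategy][omega] = risk of the pure strategy
# when nature's parameter point is omega. All numbers are illustrative.
RISK = {
    "s1": {"w1": 1.0, "w2": 4.0},
    "s2": {"w1": 3.0, "w2": 2.0},
    "s3": {"w1": 2.5, "w2": 2.5},
}
# Hypothetical prior distribution over the two parameter points.
PRIOR = {"w1": 0.7, "w2": 0.3}


def risk_against_prior(prior, risk_row):
    """Prior-weighted average risk of one pure strategy (assumed form)."""
    return sum(prior[w] * risk_row[w] for w in prior)


def bayes_solution(prior, risk):
    """Return the pure strategy with the smallest risk against the prior."""
    scores = {s: risk_against_prior(prior, row) for s, row in risk.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


def mixture_risk(prior, risk, weights):
    """Risk of a randomized strategy that plays s with probability weights[s]."""
    return sum(q * risk_against_prior(prior, risk[s]) for s, q in weights.items())


if __name__ == "__main__":
    best, value = bayes_solution(PRIOR, RISK)
    print("Bayes solution:", best, "Bayes risk:", value)
    print("50/50 mixture of s1 and s2:",
          mixture_risk(PRIOR, RISK, {"s1": 0.5, "s2": 0.5}))
```

Because the risk of a mixture is a weighted average of the risks of its pure
components, it can never fall below the smallest of them, which is the reason
attention can be confined to pure strategies in this setting.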

The use of Bayes principle to define "bestness" for designs of teaching
programs would appear to be a particularly appropriate mechanism to give
substance to the notion of tailoring a teaching program to the needs of an
individual student. The definition of the Bayes risk incorporates the concept
that the design of a teaching program which is to be best for a game involving
the responses of individual students will be dependent not only on the parameter
values ω which govern the rate of the student's learning but will also be
dependent on how well the experimenter can identify the student's learning
capacity at the outset. The experimenter's probabilistic classification of
students into the various possible populations, preliminary to the teaching
experiment, may be represented in the prior distributions.

3.  A Stimulus Sampling Teaching Model

It has been shown that in statistical games the moves and outcomes of
these games are described in terms of the sample space of a statistical
experiment. In Part 3, the sample space of a two-concept teaching model that is
essentially the same model which was developed by Dear and Atkinson [11] will
be described. However, it now seems apparent that more general experiments must
be considered to determine best sequential designs for teaching experiments
involving the sequence of responses of individual students than we considered
in that earlier paper. One pays the price of having substantially more difficult
probability distributions of responses to work with when the more extensive
set of possible experiments is considered.

It  will  be  useful  to  review  first  the  mathematical  foundations  of  the 
general  stimulus  sampling  theory  of  learning  in  order  to  see  how  probability 
distributions  are  built  up  in  the  sample  spaces  of  these  models.  The  special 
assumptions that are involved in the single-element, two-concept teaching model
which  provides  the  setting  for  these  sequential  design  studies  will  then  be 
identified.  Finally,  the  manner  of  constructing  probability  distributions  on 
the  outcome  sequences  of  these  two-concept  models  from  certain  elementary 
conditional  probabilities  and  parameter  values  will  be  shown. 

Mathematical  Foundations  of  Stimulus  Sampling  Models 

Estes and Suppes [14] have given a formal representation of the general
stimulus sampling theory for simple learning situations as an axiom system.
Since the model for teaching two related concepts that is being utilized here
is a special and, in many respects, a very simple version of stimulus
sampling models, it will not be particularly useful to review the nature of
their general axioms. However, it will be instructive to review their
description of the sample space of these models in order to indicate the special
restrictions that are imposed on the sample space in this two-concept teaching
model.

The elements of the sample space of stimulus sampling models in general
are sample sequences defined on a discrete time parameter set which consists
of trial numbers. It is customary in mathematical learning theory to start
counting with trial 1 so that the time parameter set of these processes is
usually the set of all positive integers or some subset of the positive
integers.

Estes and Suppes show that the coordinate at trial t of a sample sequence
in these models consists of an ordered 6-tuple of values, (C, T, s, i, j, k). The
first term, C, in this expression denotes a conditioning function which
partitions an abstract set of stimulus elements, say S, into subsets such that
each element in any one of the subsets is conditioned to, or connected to, a
particular response, and the various cells or subsets of the partition are each
connected to a different one of the available responses. T denotes a subset
of S that is chosen for presentation to the subject at a given trial, and hence
these authors call T a presentation set of stimuli. The third term, s, refers
to the subset of T which the subject samples. The component i refers to the
response that the subject makes on sampling the presented stimuli, while j
denotes the outcome of the trial (receipt of food, avoidance of shock, being
informed of correctness of response, etc.) that the experimenter arranges to
follow the various responses. The last value, k, in this 6-tuple designates
an unobservable reinforcing event that may occur and alter the conditioning
function at the next trial.

Single-Element Stimulus Sampling Models

A  number  of  special  cases  of  the  general  stimulus  sampling  model  have 
been  developed  for  application  to  specific  experimental  situations.  One  of 
the  simpler  versions  of  the  model  that  has  been  applied  extensively  is  the 
single-element  stimulus  sampling  model.  In  this  model,  the  set  of  stimulus 
elements  is  considered  to  consist  of  but  a  single  element.  The  sampling 
axioms  of  the  general  stimulus  sampling  theory  are  usually  modified  in  the 
single-element  version  to  assert  that  the  subject  samples  this  element  with 
probability  1  at  each  trial.  The  modification  of  the  conditioning  function 
from  trial  to  trial  is  then  assumed  to  be  governed  by  a  probabilistic  process 
involving  conditioning  rate  parameters. 

The sample space of a typical single-element model consists of sequences
whose coordinates at trial t can be reduced to a 4-tuple of values. Since
the presentation set and the sampled subset in the usual versions of
single-element models are at each trial the single stimulus element, these
components can be deleted from the description of the sample sequences.
Consequently, the description of the coordinate of a sample sequence at trial t
in a single-element model can be reduced to (C, i, j, k)_t, where these four
components which are retained are defined as in the general theory.

Frequently, the first component of the 4-tuple is given in terms of the
values of the conditioning function. For example, in a simple two-response
situation (i = 1 or 2) one might let C_1 represent the value of the conditioning
function in which the single element is conditioned to response 1. These values
of the conditioning function are often referred to as the states of the
conditioning function or as the states of conditioning.

The sample space for the particular single-element stimulus sampling model
which will provide the setting for the present study of sequential design of
teaching experiments will be described in detail in the following section. For
further information about the structure of single-element models, a number of
general references are available: see, for example, Suppes and Atkinson [21],
Estes [15], Bower [5], and Atkinson and Estes [2].

A  Two-Concept  Teaching  Model 

A  very  simple  model  of  a  teaching  situation,  which  is  at  the  same  time 
complex  enough  to  reflect  some  of  the  branching  problems  that  occur  in  automated 
teaching  experiments,  can  be  defined  in  terms  of  two  related  concepts.  The  two 
concepts which are considered will be labeled Concept A and Concept B. In the
language of stimulus sampling theory each of these two concepts can be
represented as an abstract stimulus element, say respectively A and B. Two types
of items are considered to be used in this experiment; they are called items of
type A and of type B. The various items in the set of type A items are viewed
as equivalent reproductions of the stimulus element A, and a similar
interpretation of items of type B as equivalent reproductions of the stimulus
element B is made. Consequently, the two types of items may be thought of as two
presentation sets A and B, and at each trial either the element A alone is
presented or the element B alone is presented; no other presentations of stimuli
occur. On the presentation of either element, it is assumed that the presented
element is sampled with probability 1.

There are assumed to be only two possible responses to an A item and two to a
B item; consequently, there are four responses in all and the response index
i = 1, 2, 3, or 4. However, it is further assumed that the responses are
separated into two disjoint pairs such that, say, 1 and 2 are the only responses
available to the subject on item A trials and 3 and 4 are the only responses
available on item B trials. Further, it will be assumed that the experimenter
wishes to have response 1 conditioned or connected to element A and response 3
conditioned to element B (responses 1 and 3 are respectively the correct answers
to Concept A items and Concept B items).

The  outcomes  of  each  trial  are  limited  to  two  values.  The  subject  is 
told  either  that  he  has  made  the  correct  response,  say  j  =  1  in  this  case,  or 
he  is  told  that  he  has  made  the  incorrect  response,  j  =  2.  For  simplicity,  it 
is  assumed  these  two  outcomes  have  symmetric  effects  on  the  reinforcement  of  the 
correct  response  and  that  reinforcement  occurs  with  probability  1  at  each  trial. 
Since  reinforcement  is  a  deterministic  process  by  these  assumptions,  it  will 
be  possible  for  this  model  to  delete  the  reinforcement  component  from  the 
description  of  sample  sequences. 

In defining the sample space for this two-concept teaching model one could
use the notation of the general theory and designate the coordinate of a sample
sequence at trial t as (C, T, i, j)_t. In this expression the conditioning
function C can take four possible values C_1, C_2, C_3, or C_4; the presentation
sets are either T = A or T = B; there are four possible responses R_1, R_2, R_3,
R_4; and the experimenter's outcomes j take two values, say, j = 1 being the
outcome, "you gave the correct answer" and j = 2 being the outcome, "you gave
the incorrect answer."

The use of this notation would suggest the interpretation that this
two-concept teaching model is a two-element stimulus sampling model. However,
since the four responses in this model are grouped so that only a fixed pair
are available on A trials and the remaining pair are available on B trials,
it seems appropriate to interpret this model as consisting of two "linked"
single-element processes. This interpretation is emphasized by using the
following notation to define the sample sequences. For example, let
(C_A, C̄_B, A, R_A)_t represent the outcome at trial t that the Concept A
conditioning function is in the state C_A (the stimulus element A is connected
to the correct response), the Concept B conditioning function is in the state
C̄_B, a Concept A item was presented at trial t, and the correct response, R_A,
was made to the A item.

The Concept A and Concept B conditioning functions are each assumed to
take two values. The two sets of values or states of these two conditioning
functions are denoted respectively C_A, C̄_A and C_B, C̄_B. The states C_A and
C_B can be interpreted as states in which mastery of the concepts has occurred;
that is, when a subject is in these states the stimulus element A or B is
connected to the appropriate response. The states C̄_A and C̄_B are interpreted
as guessing states. When a subject is in these states he may guess the correct
answers R_A or R_B with probabilities, say, g_A and g_B, or may guess incorrect
answers, say, R̄_A or R̄_B. The complete set of possible combinations of these
components that may occur at coordinate t of a sample sequence is the following:


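\[
\begin{array}{llll}
\{ (\bar{C}_A,\bar{C}_B,A,R_A)_t, & (\bar{C}_A,\bar{C}_B,A,\bar{R}_A)_t, & (\bar{C}_A,\bar{C}_B,B,R_B)_t, & (\bar{C}_A,\bar{C}_B,B,\bar{R}_B)_t, \\
\;\; (C_A,\bar{C}_B,A,R_A)_t, & (C_A,\bar{C}_B,A,\bar{R}_A)_t, & (C_A,\bar{C}_B,B,R_B)_t, & (C_A,\bar{C}_B,B,\bar{R}_B)_t, \\
\;\; (\bar{C}_A,C_B,A,R_A)_t, & (\bar{C}_A,C_B,A,\bar{R}_A)_t, & (\bar{C}_A,C_B,B,R_B)_t, & (\bar{C}_A,C_B,B,\bar{R}_B)_t, \\
\;\; (C_A,C_B,A,R_A)_t, & (C_A,C_B,A,\bar{R}_A)_t, & (C_A,C_B,B,R_B)_t, & (C_A,C_B,B,\bar{R}_B)_t \}
\end{array}
\]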


(The notational practice of grouping the components of the t-th coordinate of
a sample sequence within parentheses with the trial number as a subscript will
be used generally. Events or sets of the elementary sequences will usually
have their trial numbers indicated in the same way. Departures from this
notation will be defined as needed.)
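
The set of possible coordinates at a single trial can be generated
mechanically, which gives a quick check that there are sixteen of them. The
following is a minimal sketch; the string labels such as "C_A_bar" are
hypothetical names introduced here for the mastered and guessing states and
for the correct and incorrect responses.

```python
from itertools import product

# Hypothetical labels: "_bar" marks a guessing state or an incorrect response.
A_STATES = ("C_A", "C_A_bar")
B_STATES = ("C_B", "C_B_bar")
# Only the pair of responses belonging to the presented item is available.
RESPONSES = {"A": ("R_A", "R_A_bar"), "B": ("R_B", "R_B_bar")}


def coordinates(t):
    """List every (state of A, state of B, item, response, trial) combination."""
    coords = []
    for state_a, state_b in product(A_STATES, B_STATES):
        for item, responses in RESPONSES.items():
            for response in responses:
                coords.append((state_a, state_b, item, response, t))
    return coords


if __name__ == "__main__":
    coords = coordinates(t=1)
    print(len(coords))  # 4 state pairs x 2 items x 2 responses
    for c in coords:
        print(c)
```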


Marginal  Distributions  of  Responses  at  Trial  t 

The states of the conditioning function are typically not observable
aspects of stimulus sampling learning experiments. On the other hand, the
distributions of the item responses, which are observable characteristics in
these experiments, when the states of the conditioning function are given,
constitute a set of time-independent Bernoulli distributions (in settings like
the present problem which involves two possible responses). This results from
the response axioms of stimulus sampling theory.

To clarify and emphasize this point about the response distributions, the
marginal distributions of responses at a particular trial t will be described.
Letting R_t be the set of possible item administrations and responses at trial
t, i.e.,

\[
R_t = \{ (A,R_A)_t,\; (A,\bar{R}_A)_t,\; (B,R_B)_t,\; (B,\bar{R}_B)_t \},
\]

and letting C_t be the set of values of the two conditioning functions at trial
t, i.e.,

\[
C_t = \{ (\bar{C}_A,\bar{C}_B)_t,\; (C_A,\bar{C}_B)_t,\; (\bar{C}_A,C_B)_t,\; (C_A,C_B)_t \},
\]

one may then define the marginal sample space, say Y_t, as the triple
Y_t = (R_t, C_t, p{· | c_t}). The conditional distributions on the elements r_t
of the outcome space R_t, given the values c_t of the conditioning function at
trial t, are defined for this stimulus sampling model as follows:


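\[
\begin{aligned}
p\{ r_t = (A,R_A)_t \mid (A)_t,\, c_t = (C_A,C_B)_t \} &= 1 \\
p\{ r_t = (A,R_A)_t \mid (A)_t,\, c_t = (C_A,\bar{C}_B)_t \} &= 1 \\
p\{ r_t = (A,R_A)_t \mid (A)_t,\, c_t = (\bar{C}_A,C_B)_t \} &= g_A, \qquad 0 < g_A \le 1 \\
p\{ r_t = (A,R_A)_t \mid (A)_t,\, c_t = (\bar{C}_A,\bar{C}_B)_t \} &= g_A \\
p\{ r_t = (B,R_B)_t \mid (B)_t,\, c_t = (C_A,C_B)_t \} &= 1 \\
p\{ r_t = (B,R_B)_t \mid (B)_t,\, c_t = (\bar{C}_A,C_B)_t \} &= 1 \\
p\{ r_t = (B,R_B)_t \mid (B)_t,\, c_t = (C_A,\bar{C}_B)_t \} &= g_B, \qquad 0 < g_B \le 1 \\
p\{ r_t = (B,R_B)_t \mid (B)_t,\, c_t = (\bar{C}_A,\bar{C}_B)_t \} &= g_B
\end{aligned}
\]

where the probabilities g_A and g_B represent the probabilities of guessing
correct responses to Concept A and Concept B items when these concepts have
not yet been mastered.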
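
The response distributions just listed reduce to a simple rule: a correct
response is certain when the concept being tested has been mastered, and
otherwise occurs with the guessing probability for that concept. A minimal
sketch follows, assuming a state is encoded as a pair of booleans
(A mastered, B mastered); that encoding and the function names are
illustrative choices, not notation from the report.

```python
def p_correct(item, state, g_A, g_B):
    """Probability of the correct response to `item` given the conditioning state.

    state is a hypothetical encoding: (A mastered?, B mastered?).
    """
    a_mastered, b_mastered = state
    if item == "A":
        return 1.0 if a_mastered else g_A
    if item == "B":
        return 1.0 if b_mastered else g_B
    raise ValueError("item must be 'A' or 'B'")


def response_distribution(item, state, g_A, g_B):
    """Bernoulli distribution over the correct and incorrect response."""
    p = p_correct(item, state, g_A, g_B)
    return {"correct": p, "incorrect": 1.0 - p}


if __name__ == "__main__":
    # A mastered, B not mastered; guessing probabilities are illustrative.
    state = (True, False)
    print(response_distribution("A", state, g_A=0.5, g_B=0.25))
    print(response_distribution("B", state, g_A=0.5, g_B=0.25))
```

Note that the distribution for an A item does not depend on the state of
Concept B, and conversely, which is what makes the two processes separable at
the level of responses.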

For the marginal sample space Y_t, it is evident that the values of the
conditioning function C_t have been interpreted as parameters of the response
distribution at trial t. However, in stimulus sampling models the
time-dependency properties of these stochastic processes are characterized
principally through the probability distributions defined on sequences of
values of the conditioning functions. Estes and Suppes [14] have given
conditions under which sequences of certain random variables will be
finite-state Markov chains. Frequently, it turns out that the values of the
conditioning functions can be taken as states of a first-order Markov chain.
The manner in which it seems necessary to represent the set of possible pure
experiments in studying the design of these two-concept teaching programs does
not allow the sequence of values of the conditioning functions to be a
first-order Markov chain. This results because the rules governing the choice
of presentation sets or items to be administered must allow for all
configurations of past histories of items presented and responses obtained.

The Set of All Possible Experiments

It will be convenient for the purpose of defining the set of all possible
teaching experiments, say E_n, which the experimenter could perform, first to
restrict the number of trials to be allowed in a teaching program to be at
most a finite number, say n. (This is not a practical restriction at all but
it does impose some mathematical restrictions, principally on types of limiting
processes that can be performed.) The teaching program under this restriction
can thus be regarded as a sequential statistical game truncated at n trials.

It is clear that the rules for defining a complete experiment in a
sequential teaching program can be based only on the observables, the types
of items that have been administered and the responses that occurred to these
items. For a teaching experiment involving two types of items and dichotomous
responses to these items over n trials, one will find that E_n consists of
2^(2n-1) experiments. All of the component experiments in E_n may be described
by enumerating all the trees or branching patterns that can be generated from
consideration of the types of items that may be administered at each trial
(A or B) and the responses to these items (R_A or R̄_A and R_B or R̄_B).


It may help, to clarify further the concept of an experiment, to list
several trees of experiments in this teaching model. For simplicity's sake,
attention will again be restricted to small experiments, consisting in this
illustration of 3 trials. Let R = R_1×R_2×R_3 be the outcome space for such a
3-trial experiment. It is readily shown that R consists of 4^3 = 64 outcome
sequences. An experiment e_{k,3} will be a subset of R in which the responses at
trial 3 are ignored.

A teaching experiment can be viewed then as a rule which determines a
sequence of subexperiments or item administrations over all trials. Such
rules can be illustrated by branching patterns or trees. Two trees that define
two experiments for a three-trial learning situation are shown in Figures 2
and 3. The first tree, Figure 2, illustrates a rule which is conditioned only
on the trial numbers. This rule (e_{1,3}) is, "Administer an A item at each
odd-numbered trial and a B item at the even-numbered trial." The second tree,
Figure 3, illustrates the rule (e_{2,3}), "Administer an A item at trial 1, and
administer a B item following a correct response to any type of item, but
administer an A item following an incorrect response to any type of item."


[Figure 2. Tree of the Experiment e_{1,3}: item administered at Trial 1,
response, item administered at Trial 2, response, item administered at Trial 3.]

[Figure 3. Tree of the Experiment e_{2,3}: item administered at Trial 1,
response, item administered at Trial 2, response, item administered at Trial 3.]


These two trees illustrate the complete prescription of what types of items
are to be administered at each trial under the conditions of the rules or
experiments e_{1,3} and e_{2,3}. When the two trees are extended to include the
responses that could occur at Trial 3, there are then 8 branches to each tree.
The 8 branches represent the possible sequences that can occur as elements
of the outcome spaces, R_{1,3} and R_{2,3}, which are determined as restrictions
of the outcome space R respectively by the experiments e_{1,3} and e_{2,3}. The
elements of the outcome space R_{1,3} are the eight sequences

\[
\begin{array}{lll}
(A,\bar{R}_A)_1, & (B,\bar{R}_B)_2, & (A,\bar{R}_A)_3 \\
(A,\bar{R}_A)_1, & (B,\bar{R}_B)_2, & (A,R_A)_3 \\
(A,\bar{R}_A)_1, & (B,R_B)_2, & (A,\bar{R}_A)_3 \\
(A,\bar{R}_A)_1, & (B,R_B)_2, & (A,R_A)_3 \\
(A,R_A)_1, & (B,\bar{R}_B)_2, & (A,\bar{R}_A)_3 \\
(A,R_A)_1, & (B,\bar{R}_B)_2, & (A,R_A)_3 \\
(A,R_A)_1, & (B,R_B)_2, & (A,\bar{R}_A)_3 \\
(A,R_A)_1, & (B,R_B)_2, & (A,R_A)_3
\end{array}
\]


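while the eight sequences which comprise the outcome space of the experiment
e_{2,3} are

\[
\begin{array}{lll}
(A,\bar{R}_A)_1, & (A,\bar{R}_A)_2, & (A,\bar{R}_A)_3 \\
(A,\bar{R}_A)_1, & (A,\bar{R}_A)_2, & (A,R_A)_3 \\
(A,\bar{R}_A)_1, & (A,R_A)_2, & (B,\bar{R}_B)_3 \\
(A,\bar{R}_A)_1, & (A,R_A)_2, & (B,R_B)_3 \\
(A,R_A)_1, & (B,\bar{R}_B)_2, & (A,\bar{R}_A)_3 \\
(A,R_A)_1, & (B,\bar{R}_B)_2, & (A,R_A)_3 \\
(A,R_A)_1, & (B,R_B)_2, & (B,\bar{R}_B)_3 \\
(A,R_A)_1, & (B,R_B)_2, & (B,R_B)_3
\end{array}
\]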


It is evident that these rules for determination of experimental sequences
are deterministic or non-randomized rules concerning what type of item to
administer next, given the history of item administrations and associated
responses that has occurred prior to the current trial. It is clear that
in the 3-trial situation illustrated here there are 2^5 = 32 distinct trees
and hence 32 possible experiments. These experiments would represent all
the pure strategies that the statistician could employ with respect to the
allocation of types of items to the various trials. Randomized item
allocation strategies can be developed by taking mixtures of the pure strategies.
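
The idea of an experiment as a deterministic rule mapping the observed history
to the next item can be sketched directly in code. The following is a minimal
illustration of the two example rules and of the restriction each imposes on
the outcome space; the function names and the encoding of a history as a tuple
of (item, correct) pairs are assumptions made for this sketch.

```python
from itertools import product


def rule_e13(history):
    """A item on odd-numbered trials, B item on even-numbered trials."""
    trial = len(history) + 1
    return "A" if trial % 2 == 1 else "B"


def rule_e23(history):
    """A item at trial 1; afterwards B after a correct response, A after an error."""
    if not history:
        return "A"
    _, last_correct = history[-1]
    return "B" if last_correct else "A"


def outcome_space(rule, n=3):
    """All observable sequences of (item, correct) pairs that the rule permits."""
    sequences = []
    for responses in product((True, False), repeat=n):
        history = []
        for correct in responses:
            item = rule(tuple(history))  # the rule sees only the observables
            history.append((item, correct))
        sequences.append(tuple(history))
    return sequences


def unrestricted_size(n=3):
    """Number of sequences in the full outcome space R (any item at any trial)."""
    single_trial = list(product("AB", (True, False)))
    return len(list(product(single_trial, repeat=n)))


if __name__ == "__main__":
    print(unrestricted_size(3))
    for seq in outcome_space(rule_e23):
        print(seq)
```

Each rule selects 8 of the 64 sequences in the unrestricted three-trial
outcome space, one for each pattern of correct and incorrect responses.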

Conditional Sample Space of the Truncated Teaching Experiment e_{k,n}

For each experiment e_{k,n} ∈ E_n, one may describe a sample space for a
truncated teaching program that will continue for at most n trials. Letting
X be the outcome space for such a truncated teaching program, one can
represent this outcome space as the product space

\[
X = \Bigl( \prod_{t=1}^{n} C_t \times R_t \Bigr) \times C_{n+1}.
\]

In this definition, one sees that although the observable item administrations
and responses are truncated at trial n, the effects of the items administered
at trial n and responses to these items may be carried along to modify the
distribution on the set of conditioning states at trial n + 1, C_{n+1}.

Let the sample space for such an experiment be called Z, where

\[
Z = \bigl( X,\; \Omega,\; p\{\cdot \mid \omega, e_{k,n}\} \bigr).
\]

This triple consists of: (1) the outcome space X; (2) the parameter set Ω,
whose elements ω are vector-valued parameters which govern the changes in the
distributions on the conditioning states and the response distributions; and
(3) the probability distribution p{· | ω, e_{k,n}} on the sequences x given a
parameter point ω and the restriction of the outcome space X by the experiment
e_{k,n}.

It will be convenient to introduce at this point a notation to distinguish
the conditioning state components and the observable components of the
outcome sequences x. The symbols that will be used to represent generically
these two sub-sequences of components in any outcome sequence x will be c to
represent the sub-sequence of conditioning state components and r to represent
the sub-sequence of item administrations and responses, thus: x = (c,r).

Frequently,  the  values  of  the  sequence  of  conditioning  functions  in 
stimulus sampling models will form a Markov chain; but it has been noted that
the manner of definition of the set of experiments generally will not permit
a simple definition of a first-order Markov chain on the conditioning states
in  the  present  problem.  However,  it  is  possible  to  utilize  some  of  the  matrix 
theory  associated  with  the  theory  of  finite  Markov  chains  to  simplify  the 
representation  of  the  distribution  of  response  sequences.  For  this  reason, 
the  vector  of  initial  probabilities  of  being  in  the  various  states  of  the 
conditioning  function  will  be  defined  and  two  matrices  of  probabilities  of 
transition  from  state  to  state  will  be  defined. 

Let the vector of initial state probabilities be called P_1, that is,

\[
P_1 = \bigl[\, p\{(\bar{C}_A,\bar{C}_B)_1\},\; p\{(C_A,\bar{C}_B)_1\},\; p\{(\bar{C}_A,C_B)_1\},\; p\{(C_A,C_B)_1\} \,\bigr].
\]

The transitions from state to state will be governed by four conditioning
rate parameters, say, θ_A, θ_AB, θ_B, θ_BA. These parameters constitute the
non-zero entries in the two transition matrices, which will be called P_A and
P_B. The P_A matrix applies to those trials where an A item is used and
conversely the P_B matrix applies to B item trials. The structures of these two
matrices are shown below:


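\[
P_A =
\begin{array}{c|cccc}
 & (\bar{C}_A,\bar{C}_B) & (C_A,\bar{C}_B) & (\bar{C}_A,C_B) & (C_A,C_B) \\ \hline
(\bar{C}_A,\bar{C}_B) & 1-\theta_A & \theta_A & 0 & 0 \\
(C_A,\bar{C}_B) & 0 & 1 & 0 & 0 \\
(\bar{C}_A,C_B) & 0 & 0 & 1-\theta_{AB} & \theta_{AB} \\
(C_A,C_B) & 0 & 0 & 0 & 1
\end{array}
\]

and

\[
P_B =
\begin{array}{c|cccc}
 & (\bar{C}_A,\bar{C}_B) & (C_A,\bar{C}_B) & (\bar{C}_A,C_B) & (C_A,C_B) \\ \hline
(\bar{C}_A,\bar{C}_B) & 1-\theta_B & 0 & \theta_B & 0 \\
(C_A,\bar{C}_B) & 0 & 1-\theta_{BA} & 0 & \theta_{BA} \\
(\bar{C}_A,C_B) & 0 & 0 & 1 & 0 \\
(C_A,C_B) & 0 & 0 & 0 & 1
\end{array}
\]

In each matrix the rows are indexed by the state at trial t and the columns by
the state at trial t + 1.

The four conditioning rate parameters may be interpreted as follows: the
parameter θ_A represents the rate of transition from C̄_A to C_A when Concept B
has not been mastered, and a similar interpretation is given to θ_B; the
parameter θ_AB represents the rate of transition from C̄_A to C_A when Concept B
has been mastered; and the parameter θ_BA represents the rate of transition
from C̄_B to C_B when Concept A has been mastered. Minimal restrictions on
these parameters are that 0 < θ_A < θ_AB < 1 and 0 < θ_B < θ_BA < 1.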
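
The two transition matrices can be written out numerically to check that each
row is a probability distribution. The sketch below assumes the state ordering
(neither mastered), (A only), (B only), (both mastered), and it assumes that a
mastered concept is never lost, so that mastered states are left only by
mastering the other concept; both are assumptions of this illustration, as are
the function name and the sample parameter values.

```python
# Assumed state order, encoded as (A mastered?, B mastered?).
STATES = [(False, False), (True, False), (False, True), (True, True)]


def transition_matrices(theta_A, theta_AB, theta_B, theta_BA):
    """Return (P_A, P_B) as nested lists; rows = state at t, columns = state at t+1."""
    P_A = [
        [1 - theta_A, theta_A, 0.0, 0.0],      # A learned while B is unmastered
        [0.0, 1.0, 0.0, 0.0],                  # A already mastered
        [0.0, 0.0, 1 - theta_AB, theta_AB],    # A learned while B is mastered
        [0.0, 0.0, 0.0, 1.0],                  # both mastered
    ]
    P_B = [
        [1 - theta_B, 0.0, theta_B, 0.0],      # B learned while A is unmastered
        [0.0, 1 - theta_BA, 0.0, theta_BA],    # B learned while A is mastered
        [0.0, 0.0, 1.0, 0.0],                  # B already mastered
        [0.0, 0.0, 0.0, 1.0],                  # both mastered
    ]
    return P_A, P_B


def step(distribution, matrix):
    """Advance a row vector of state probabilities by one trial."""
    size = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(size))
            for j in range(size)]


if __name__ == "__main__":
    # Illustrative values satisfying theta_A < theta_AB and theta_B < theta_BA.
    P_A, P_B = transition_matrices(0.3, 0.6, 0.2, 0.4)
    start = [1.0, 0.0, 0.0, 0.0]
    print(step(step(start, P_A), P_B))  # an A trial followed by a B trial
```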

The parameter space Ω will include coordinates that serve as parameters
of either the response distributions or of the total stochastic process
defined on the outcome sequences x. A representative parameter point ω will
be defined to be a 10-tuple of the following form:

\[
\omega = \bigl[\, g_A,\; g_B,\; \theta_A,\; \theta_{AB},\; \theta_B,\; \theta_{BA},\;
  p\{(\bar{C}_A,\bar{C}_B)_1\},\; p\{(C_A,\bar{C}_B)_1\},\; p\{(\bar{C}_A,C_B)_1\},\; p\{(C_A,C_B)_1\} \,\bigr].
\]

One  can  develop  the  joint  probability  distribution  on  the  outcome  space 
of  a  stochastic  process  such  as  this  teaching  experiment  in  many  ways. 

Since  the  teaching  experiment  is  a  truncated  experiment,  this  joint  distribution 
is defined on only a finite-dimensional domain and the representation of the
joint distribution is straightforward. In that part of stochastic process
theory which deals with discrete state and time-parameter sets one usually
regards the individual random sequences as elementary events. In principle,
probabilities may then be assigned to each elementary event in the sample space
and probabilities of more general events of interest are derived from the
probabilities of the individual outcome sequences.

Typically, it is conceptually too difficult to assign probabilities
directly to the individual outcome sequences of a process such as this teaching
experiment. The joint distribution will often be constructed from certain
marginal and conditional probabilities. To illustrate the development of the
joint probability distribution on the outcome space of this teaching experiment,
a representative sequence will be considered. For example, the following
sequence is an elementary event or point in the outcome space of the
experiment e_{2,3}:

\[
(\bar{C}_A,\bar{C}_B,A,R_A)_1,\; (\bar{C}_A,\bar{C}_B,B,R_B)_2,\; (\bar{C}_A,\bar{C}_B,B,R_B)_3,\; (\bar{C}_A,C_B)_4 .
\]

Consider then the probability of this sequence given the experiment e_{2,3} and
a parameter point ω; i.e., the conditional probability

\[
p\{ (\bar{C}_A,\bar{C}_B,A,R_A)_1,\; (\bar{C}_A,\bar{C}_B,B,R_B)_2,\; (\bar{C}_A,\bar{C}_B,B,R_B)_3,\; (\bar{C}_A,C_B)_4 \mid \omega, e_{2,3} \}.
\]


The  confutation  of  the  probability  of  this  sequence  will  be  sketched  below 
and  some  brief  remarks  will  be  made  in  justification  of  various  steps  of  the 
confutation.  From  the  multiplication  law  for  the  probability  of  joint  events, 


one  can  write  that 


9  April  1963 


TM- ll6l/000/00 


c 


P\^CA,CB,A,RA^1>  ^CA;CB,B,RB^2'’  R,RB^3*  ^  %'-2,3| 

=  p|(Rg)y  (\>^b\  1  ^A,^B>A,RaV  @A^B*B,RB^2>  ^A,^B,B'I3,  %*-2,^ 
♦p|(D)5  I  ^A^B^^A^l*  ^CA,CB,B,RB^2j  ^PJ^B^y  V-2,3} 

-P{(V2’  (CA'CbV  ^A> CB' B^2'>  %>-2,^\ 

•p|(b)2I  ^a,CB,A,RaV  ^a^b^  ^-2,3} 

,p{(RA^r  ^CA'CB^  ^A,CB'Ah'  %>-2,^\ 

*p{(A)l  I  (CA'CB^1^  ^b’-2,}\  ^^A'Sh^  ^O’’— 2, 3]"’ 


Consider;  in  order;  the  evaluation  of  the  terms  on  the  right-hand  side  of  this 
expression : 


(Ca,Cb)4I  (C^, CB,A,Rk)v  2,  (ca>CB,B^5'  SJyCj 2 

=  I  ^A’^B’^y  ^-2,3}  ^(^A'^VlJ  ^CA^B,B^3"  ^-2,3} 
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(by  the  assumptions, for  these  stimulus  sampling  models,  that  probabilities  of 
responses  and  next  states  of  the  conditioning  function  are  dependent  only  on 
the  current  state  of  the  conditioning  function,  and  that  conditional  probabili¬ 
ties  given  the  current  state  of  the  condition  function  of  current  responses 
and  next  states  of  the  conditioning  function  are  independent). 


p{(b)5  I  (Vb<a’ra>i-  <SA’SB'B'KB>2'  <Vb>5’  StfSe.i't 

,  pjtBjjl  (A,Ba)1,  (B,Hb)2J  Vi2(3| 


(by  the  definition  of  the  experiment  eQ  x). 
By  similar  arguments  one  has  that 


pfcRgJgA  I  ^A'^B'^Vl'  ^A' CB,B^2'’  ^-2,3} 

=  p{(Rb)2  I  (Vb'B)2, 

^Vl*  ^CA,CB^I  (Ck>^B,kh*  ^-2,3} 

=  ^Vl  I  ^CA,CB,Ah#  ^CA,CB^2^  ^A,CB'AV  ^-2,3} 

and 

p|(b)2I  ^B* A* BA^  l*  ^A'^2'  ^-2,3}  =  I  ^A' Vl'  -o'— 2, 3^  * 
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In  summary,  vhen  these  simplifying  values  are  substituted  into  the 
expression  for  the  conditional  probability  of  the  complete  sequence,  one 
obtains  the  result  that 

^A'^B'13'1^^'  ^A'CB'B,RbV  ^A'^bM  -o}—2,  3} 

=  p{(Rb)5I  (Ca,Cb,B)3,  p{(CA>Cb)4  I  (Ca,Cb,B)3,  <^0,e2^3| 

.p|(B)3l,(A,RA)r 

,pVV2^  <CA'CB'B>2'  ^-2,3}  (CA,CB,B4'  —o’ —2, 3} 

4(B)2  1  (A'Vl'  ^,3} 

•p|(ra)i  I  ^a'^B'A4'  -o' -2, 3}  P^A'^bU  (Ca' CB'A)1'  -o' -2, 3} 

>p{(A)i!  ^-2,3}  P{(1 Wi 1  —o'— 2, 3} * 

To evaluate the probability of this sequence, one uses the values given
explicitly in the parameter point $\omega$ and other values implied by the
definition of the experiment $e_{2,3}$; thus,
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^3 


■  Sb.o^-V’  1  «B,0  (1-9B,o>  1  eA,0  1 


4,0^0  (l-Vo>  Po{(?A’?B>l[  ' 


This  example  shows  how  the  probability  distribution  on  the  outcome 
sequences  may  be  evaluated  in  terms  of  certain  marginal  and  conditional 
distributions.  The  probabilities  which  will  be  of  chief  interest  in  problems 
dealing  with  the  design  of  teaching  experiments  are  probabilities  of  response 
sequences  and  certain  marginal  probability  distributions  of  the  states  of  the 
conditioning  functions.  Matrix  operators  may  be  defined  which  will  provide  a 
convenient  way  to  compute  these  two  types  of  probabilities  in  a  manner  very 
similar  to  the  matrix  operator  calculation  of  response  probabilities  that  is 
employed  in  linear  models  of  learning  [e.g.  see  Bush  and  Mosteller,  8]. 

Conditional probabilities of the various responses, given the current
state of the conditioning function, were presented in the illustration of the
computation of the probability of an outcome sequence, for example,
$P\{(R_B)_3 \mid (S,B)_3,\ \omega, e_{2,3}\}$, where $(S)_3$ denotes the state
$(C_A,C_B)_3$ of the conditioning function at trial 3. Since these probabilities
are assumed to be constant over trial numbers, they will be collected into four
matrices. Let $D_{R_A}$, $D_{\bar R_A}$, $D_{R_B}$, and $D_{\bar R_B}$ be the
following four diagonal matrices:
$$
D_{R_A} = \begin{pmatrix} g_A & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & g_A & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},
\qquad
D_{\bar R_A} = \begin{pmatrix} 1-g_A & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1-g_A & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},
$$

$$
D_{R_B} = \begin{pmatrix} g_B & 0 & 0 & 0 \\ 0 & g_B & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},
\qquad
D_{\bar R_B} = \begin{pmatrix} 1-g_B & 0 & 0 & 0 \\ 0 & 1-g_B & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.
$$

The rows of $D_{R_A}$ and $D_{\bar R_A}$ are labeled, in order, by
$(\bar C_A,\bar C_B,A)_t$, $(C_A,\bar C_B,A)_t$, $(\bar C_A,C_B,A)_t$, and
$(C_A,C_B,A)_t$, and the columns by the response $(R_A)_t$ or $(\bar R_A)_t$;
the rows of $D_{R_B}$ and $D_{\bar R_B}$ are labeled in the same order with a B
item, and the columns by $(R_B)_t$ or $(\bar R_B)_t$. A bar marks an
unconditioned concept or an incorrect response.
The principal diagonal entries in these matrices are the probabilities of
various responses given the current state of the conditioning function and the
type of item administered as specified by the row label. Off-diagonal entries
are defined to be zero. The matrix $D_{R_A}$ gives conditional probabilities of
correct response to A items; the matrix $D_{\bar R_A}$ gives conditional
probabilities of incorrect responses to A items. The matrices $D_{R_B}$ and
$D_{\bar R_B}$ are similarly defined for trials involving B items.

It has been shown that the conditional probability of a state of the
conditioning function at the next trial and a particular response at the
current trial, given the conditioning state at the current trial and the type
of item that was administered, may be broken down into a certain product of
two conditional probabilities. For example, writing $(S)_t$ for the state
$(C_A,C_B)_t$ of the conditioning function at trial $t$,

$$P\{(R_B)_3,\ (S)_4 \mid (S,B)_3,\ \omega, e_{2,3}\} = P\{(R_B)_3 \mid (S,B)_3,\ \omega, e_{2,3}\}\; P\{(S)_4 \mid (S,B)_3,\ \omega, e_{2,3}\}.$$

One will recognize that, given the experimental rule $e_{2,3}$, the remaining
parts of the specification of the two conditional probabilities on the
right-hand side of this expression are contained in the matrices $D_{R_B}$ and
$P_B$. In general, the various sets of conditional probabilities of the type
given on the left-hand side of the above expression may be computed as the
product of one of the diagonal matrices with either the $P_A$ or $P_B$ matrix.
The resulting conditional probabilities, along with the specification of the
experimental rule and the parameter values $\omega$, provide the basic
quantities needed to compute the relevant probabilities.

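Several more matrices and vectors will be defined to complete a basis
for convenient representation of the computation of important probabilities.
Let $P_{R_A}$, $P_{\bar R_A}$, $P_{R_B}$, $P_{\bar R_B}$ be matrix operators or
transformation matrices where

$$
P_{R_A} = D_{R_A} P_A = \begin{pmatrix}
g_A(1-\theta_A) & g_A\theta_A & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & g_A(1-\theta_{AB}) & g_A\theta_{AB} \\
0 & 0 & 0 & 1
\end{pmatrix},
$$

$$
P_{\bar R_A} = D_{\bar R_A} P_A = \begin{pmatrix}
(1-g_A)(1-\theta_A) & (1-g_A)\theta_A & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & (1-g_A)(1-\theta_{AB}) & (1-g_A)\theta_{AB} \\
0 & 0 & 0 & 0
\end{pmatrix},
$$

$$
P_{R_B} = D_{R_B} P_B = \begin{pmatrix}
g_B(1-\theta_B) & 0 & g_B\theta_B & 0 \\
0 & g_B(1-\theta_{BA}) & 0 & g_B\theta_{BA} \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix},
$$

$$
P_{\bar R_B} = D_{\bar R_B} P_B = \begin{pmatrix}
(1-g_B)(1-\theta_B) & 0 & (1-g_B)\theta_B & 0 \\
0 & (1-g_B)(1-\theta_{BA}) & 0 & (1-g_B)\theta_{BA} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}.
$$
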
These four matrix operators permit one to calculate the probabilities of the
states of the conditioning function which correspond to the various branches
of the trees of the different experiments. For example, experiment $e_{2,3}$
uses an A item at trial 1. The joint probabilities of the states of the
conditioning function at trial 2, in conjunction with a correct response,
$R_A$, to the A item at trial 1, are computed by applying the transformation
matrix $P_{R_A}$ to the vector of initial state probabilities $P_1$; thus,

$$\bigl[\,P\{(S)_2,\ (A,R_A)_1\}\,\bigr]_{S} = P_1'\,P_{R_A},$$

where $S$ ranges over the four states of the conditioning function in the order
of the rows of the transformation matrices. If an incorrect response had
occurred at trial 1, the joint probabilities of the conditioning states at
trial 2, in conjunction with an incorrect response $\bar R_A$ at trial 1, would
be computed by applying the matrix $P_{\bar R_A}$; i.e.,

$$\bigl[\,P\{(S)_2,\ (A,\bar R_A)_1\}\,\bigr]_{S} = P_1'\,P_{\bar R_A}.$$


Note that $P_{R_A}$ and $P_{\bar R_A}$ are not stochastic matrices ($P_{R_B}$
and $P_{\bar R_B}$ also are not stochastic matrices); consequently the vectors
$P_1' P_{R_A}$ and $P_1' P_{\bar R_A}$ do not represent probability
distributions. However, the sum of these two vectors does yield the conditional
probability distribution of the states of the conditioning function at trial 2,
given that an A item was administered at trial 1, since

$$P_1' P_{R_A} + P_1' P_{\bar R_A} = P_1'\,(D_{R_A} + D_{\bar R_A})\,P_A = P_1'\, I\, P_A = P_1' P_A$$

and $P_A$ is a stochastic matrix.
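The identity above can be checked numerically. The following is a minimal sketch, not the report's own program: the parameter values and the initial vector are hypothetical, and the state ordering (neither concept conditioned, A only, B only, both) is an assumption about how the rows of the matrices are arranged.

```python
# Sketch only: hypothetical parameter values; the state ordering
# (neither, A only, B only, both concepts conditioned) is an assumption.
g_A, theta_A, theta_AB = 0.25, 0.3, 0.6


def diag(values):
    """Return a square diagonal matrix (list of rows) with the given diagonal."""
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def vecmat(v, M):
    """Multiply a row vector by a matrix."""
    return [sum(vi * row[j] for vi, row in zip(v, M)) for j in range(len(M[0]))]


# Transition matrix for an A item: an unconditioned concept A becomes
# conditioned with probability theta_A (B unconditioned) or theta_AB (B conditioned).
P_A = [
    [1 - theta_A, theta_A, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1 - theta_AB, theta_AB],
    [0.0, 0.0, 0.0, 1.0],
]

D_RA = diag([g_A, 1.0, g_A, 1.0])             # P(correct | state, A item)
D_notRA = diag([1 - g_A, 0.0, 1 - g_A, 0.0])  # P(incorrect | state, A item)

P_RA = matmul(D_RA, P_A)
P_notRA = matmul(D_notRA, P_A)

p1 = [0.7, 0.1, 0.1, 0.1]  # hypothetical initial state probabilities

joint_correct = vecmat(p1, P_RA)       # states at trial 2 jointly with R_A
joint_incorrect = vecmat(p1, P_notRA)  # states at trial 2 jointly with not-R_A
state_dist_trial2 = [a + b for a, b in zip(joint_correct, joint_incorrect)]
```

Neither joint vector sums to one on its own, but their sum does, and it agrees with applying the stochastic matrix for A items directly to the initial vector.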


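Definitions of four additional vectors will be introduced here in order
to simplify the computation of various response probabilities. Let

$$
\mathbf g_A = \begin{pmatrix}
P\{(R_A)_t \mid (\bar C_A,\bar C_B,A)_t\} \\
P\{(R_A)_t \mid (C_A,\bar C_B,A)_t\} \\
P\{(R_A)_t \mid (\bar C_A,C_B,A)_t\} \\
P\{(R_A)_t \mid (C_A,C_B,A)_t\}
\end{pmatrix}
= \begin{pmatrix} g_A \\ 1 \\ g_A \\ 1 \end{pmatrix},
\qquad
\bar{\mathbf g}_A = \begin{pmatrix}
P\{(\bar R_A)_t \mid (\bar C_A,\bar C_B,A)_t\} \\
P\{(\bar R_A)_t \mid (C_A,\bar C_B,A)_t\} \\
P\{(\bar R_A)_t \mid (\bar C_A,C_B,A)_t\} \\
P\{(\bar R_A)_t \mid (C_A,C_B,A)_t\}
\end{pmatrix}
= \begin{pmatrix} 1-g_A \\ 0 \\ 1-g_A \\ 0 \end{pmatrix},
$$

$$
\mathbf g_B = \begin{pmatrix}
P\{(R_B)_t \mid (\bar C_A,\bar C_B,B)_t\} \\
P\{(R_B)_t \mid (C_A,\bar C_B,B)_t\} \\
P\{(R_B)_t \mid (\bar C_A,C_B,B)_t\} \\
P\{(R_B)_t \mid (C_A,C_B,B)_t\}
\end{pmatrix}
= \begin{pmatrix} g_B \\ g_B \\ 1 \\ 1 \end{pmatrix},
\qquad
\bar{\mathbf g}_B = \begin{pmatrix}
P\{(\bar R_B)_t \mid (\bar C_A,\bar C_B,B)_t\} \\
P\{(\bar R_B)_t \mid (C_A,\bar C_B,B)_t\} \\
P\{(\bar R_B)_t \mid (\bar C_A,C_B,B)_t\} \\
P\{(\bar R_B)_t \mid (C_A,C_B,B)_t\}
\end{pmatrix}
= \begin{pmatrix} 1-g_B \\ 1-g_B \\ 0 \\ 0 \end{pmatrix}.
$$
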
Probabilities of sub-sequences of responses along the various branches of
an experimental tree may readily be computed using these vectors. For example,
in the experiment $e_{2,3}$ the probability of a correct response to the A item
at trial 1 is $P_1'\,\mathbf g_A$, while the corresponding probability of an
incorrect response is $P_1'\,\bar{\mathbf g}_A$.

Conditional probability distributions on the states of the conditioning
function given a response sequence may now be calculated. The conditional
distribution of these states at trial 2 given a correct response to the A item
at trial 1, for example, is given by

$$\bigl[\,P\{(S)_2 \mid (A,R_A)_1\}\,\bigr]_{S} = \frac{P_1'\,P_{R_A}}{P_1'\,\mathbf g_A},$$

where $S$ ranges over the four states of the conditioning function.


Joint probabilities of the states of the conditioning function with higher
level sub-sequences of responses are computed by applying the transformation
matrix appropriate to the item administration and response that occurred at
the current trial. To illustrate, the response sequence for experiment
$e_{2,3}$ which involves only correct responses will be considered. One obtains
the joint probabilities

$$\bigl[\,P\{(S)_3,\ (A,R_A)_1,\ (B,R_B)_2\}\,\bigr]_{S} = P_1'\,P_{R_A}\,P_{R_B}$$

and

$$\bigl[\,P\{(S)_4,\ (A,R_A)_1,\ (B,R_B)_2,\ (B,R_B)_3\}\,\bigr]_{S} = P_1'\,P_{R_A}\,P_{R_B}\,P_{R_B},$$

where $S$ ranges over the four states of the conditioning function.
The corresponding conditional distributions of the states of conditioning
given these response sequences are respectively

$$\bigl[\,P\{(S)_3 \mid (A,R_A)_1,\ (B,R_B)_2\}\,\bigr]_{S} = \frac{P_1'\,P_{R_A}\,P_{R_B}}{P_1'\,P_{R_A}\,\mathbf g_B}$$

and

$$\bigl[\,P\{(S)_4 \mid (A,R_A)_1,\ (B,R_B)_2,\ (B,R_B)_3\}\,\bigr]_{S} = \frac{P_1'\,P_{R_A}\,P_{R_B}\,P_{R_B}}{P_1'\,P_{R_A}\,P_{R_B}\,\mathbf g_B}.$$



These  matrix  calculations  provided  a  convenient  scheme  for  computing  all 
of  the  probabilities  that  were  necessary  to  determine  best  designs  in  the 
illustrations  that  are  taken  up  in  Part  5  of  this  report.  A  few  additional 
types  of  probabilities  are  required  for  the  determination  of  the  sequential 
designs  in  that  section;  however,  the  examples  of  computations  that  have  been 
given  here  should  be  sufficient  to  indicate  the  general  scheme  for  computing 
probabilities  of  events  in  these  truncated  statistical  games. 
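
The general scheme can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions rather than the procedure used for the report's calculations: the parameter values are hypothetical, the names `transition`, `response_probs`, `joint_state_probs`, and `posterior_states` are invented here, and the states are assumed to be ordered as neither concept conditioned, A only, B only, both.

```python
# Sketch only: hypothetical parameter values; states ordered as
# (neither, A only, B only, both concepts conditioned), which is an assumption.
PARAMS = {"g_A": 0.25, "g_B": 0.2, "th_A": 0.3, "th_AB": 0.6, "th_B": 0.1, "th_BA": 0.4}


def transition(item, p):
    """State transition matrix for an A item or a B item."""
    if item == "A":
        a, ab = p["th_A"], p["th_AB"]
        return [[1 - a, a, 0, 0], [0, 1, 0, 0], [0, 0, 1 - ab, ab], [0, 0, 0, 1]]
    b, ba = p["th_B"], p["th_BA"]
    return [[1 - b, 0, b, 0], [0, 1 - ba, 0, ba], [0, 0, 1, 0], [0, 0, 0, 1]]


def response_probs(item, correct, p):
    """P(response | state, item) for each of the four states."""
    g = p["g_A"] if item == "A" else p["g_B"]
    conditioned = (1, 3) if item == "A" else (2, 3)
    right = [1.0 if s in conditioned else g for s in range(4)]
    return right if correct else [1.0 - r for r in right]


def joint_state_probs(p1, history, p):
    """Joint probabilities of the next-trial states and the observed history.

    history is a list of (item, correct) pairs, one per completed trial.
    """
    v = list(p1)
    for item, correct in history:
        d = response_probs(item, correct, p)
        weighted = [vi * di for vi, di in zip(v, d)]  # apply the diagonal matrix
        T = transition(item, p)
        v = [sum(weighted[i] * T[i][j] for i in range(4)) for j in range(4)]
    return v


def posterior_states(p1, history, p):
    """Conditional state distribution given the history, and P(history)."""
    joint = joint_state_probs(p1, history, p)
    total = sum(joint)  # probability of the response sequence itself
    return [x / total for x in joint], total


p1 = [0.7, 0.1, 0.1, 0.1]  # hypothetical initial state probabilities
post, prob = posterior_states(p1, [("A", True), ("B", True)], PARAMS)
```

Summing the joint vector gives the probability of the response sequence, so dividing by that sum is the same normalization as dividing by the product of the initial vector, the transformation matrices, and the appropriate response vector.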
4. Objective, Loss, and Risk Functions in Teaching Experiments

The set of states of nature $\Omega$ which was described in Part 3 for the
two-concept, stimulus-sampling teaching model consists of multidimensional
parameter points whose components are initial probabilities of the various
states of the conditioning functions, learning rate parameters, and guessing
probabilities. If the statistical game that one wished to consider for these
teaching experiments were to be concerned with estimating the values of the
components of these learning parameters, the set $\Omega$ would be the
appropriate set of states of nature or set of pure strategies for nature. In a
pure estimation problem, $\Omega$ would no doubt be represented as a continuous
set, probably a hypercube embedded in a multidimensional real space.

Objective  Functions 

More modest goals seem appropriate when considering the design of many
teaching experiments. Frequently, one will be more concerned with the
occurrence of events such as the mastery of the more difficult of the two
concepts, or the mastery of both concepts, or the occurrence of correct
responses rather than with obtaining values for an estimator, say
$\hat\omega$, of the parameters of the original game. Hence from the original
two-concept teaching game $G = (\Omega,\ E \times B \times D,\ \rho)$ several
alternative games involving simpler objectives will be "extracted," or perhaps
it is better to consider that $G$ will be "restricted" to simpler objectives.
The expression of these alternative objectives can be done in several ways;
conveniently, one can express these objectives either in terms of certain sets
of events in the outcome space $X$ or in terms of "objective functions" defined
on these sets of events.

Letting $\mathcal A$ represent generally such sets or classes of objective
events in $X$, one can define the new game restricted by $\mathcal A$ in some
instances as $G = (\mathcal A \times \Omega,\ E \times B \times D,\ \rho)$ and
in other instances simply as $G = (\mathcal A,\ E \times B \times D,\ \rho)$.
On the other hand, letting $\varphi$ be an objective function defined on
$\mathcal A$, one can characterize the restricted games in terms of $\varphi$
as either $G = (\varphi(\mathcal A) \times \Omega,\ E \times B \times D,\ \rho)$
or in some cases $G = (\varphi(\mathcal A),\ E \times B \times D,\ \rho)$, where
$\varphi(\mathcal A)$ represents the range space of $\varphi$. Specific
examples of plausible objective functions are given in the following
illustrations.


Mastery of both concept A and concept B

In teaching situations involving the presentation of two concepts (or more
generally k concepts) one might wish to design a teaching program which in some
sense was best for the students mastering both concepts. It appears useful to
distinguish two classes of events relating to this objective:

let $\mathcal A_{(C_A,C_B)^1} = \{(C_A,C_B)^1_1,\ (C_A,C_B)^1_2,\ \ldots,\ (C_A,C_B)^1_{n+1},\ \overline{(C_A,C_B)}_{n+1}\}$

where $(C_A,C_B)^1_t$ represents the event that both concept A and concept B
were mastered for the first time at trial $t$ ($t = 1, 2, \ldots, n+1$) and
$\overline{(C_A,C_B)}_{n+1}$ represents the event that not both concept A and
concept B have been mastered by the beginning of trial $n+1$ (the experiment
having been truncated at trial $n$);

also let $\mathcal A_{(C_A,C_B)} = \{(C_A,C_B)_1,\ (C_A,C_B)_2,\ \ldots,\ (C_A,C_B)_{n+1},\ \overline{(C_A,C_B)}_{n+1}\}$

where $(C_A,C_B)_k = \bigcup_{t=1}^{k} (C_A,C_B)^1_t$ represents the event that
both concepts have been mastered by trial $k$ ($k = 1, 2, \ldots, n+1$).


These two classes of events seem to be the basic classes related to the
mastery of both concepts that are worthy of consideration. In the language
of stimulus-sampling theory, mastery of concept A and concept B means
transition into the state $C_A$ for the concept A conditioning function and
transition into the state $C_B$ for the concept B conditioning function. The
class of events $\mathcal A_{(C_A,C_B)^1}$ is seen to be a partition of the
outcome space $X$, while the class $\mathcal A_{(C_A,C_B)}$ consists of the
event $\overline{(C_A,C_B)}_{n+1}$ and a monotone-increasing class of subsets
of the complement of $\overline{(C_A,C_B)}_{n+1}$.
Instead of considering these classes of events, it may be more useful in
certain circumstances to consider numerical-valued functions defined on these
classes. For example, with respect to the class $\mathcal A_{(C_A,C_B)^1}$ it
may be desirable to define an objective function such as:

$$\varphi\bigl((C_A,C_B)^1_t\bigr) = t \qquad \text{for } t = 1, 2, \ldots, n+1$$

and

$$\varphi\bigl(\overline{(C_A,C_B)}_{n+1}\bigr) = 0.$$

For the most part, the effects of the restriction of the teaching games
by the classes of objective events or objective functions are reflected only
in the loss functions for these games. That is, the loss functions $L$ in the
restricted games are defined on $\mathcal A \times A$ or
$\varphi(\mathcal A) \times A$ instead of on the product set formed between the
parameter set $\Omega$ of the original game and the action set $A$.
Mastery of the more difficult concept B

Another type of interesting teaching objective which might be pursued
concerns itself with achieving conditioning of the more difficult of the two
concepts, concept B. Mastery of concept A may or may not be pursued as a
subgoal insofar as it promotes faster learning of concept B. Two classes of
objective events will be defined for this type of goal. Letting
$(\cdot,C_B)^1_t$ be the event that conditioning or mastery of concept B
occurred for the first time at trial $t$ ($t = 1, 2, \ldots, n+1$) and
$\overline{(\cdot,C_B)}_{n+1}$ be the event that concept B has not been
mastered after $n$ trials, define the following class of events:

$$\mathcal A_{(\cdot,C_B)^1} = \{(\cdot,C_B)^1_1,\ (\cdot,C_B)^1_2,\ \ldots,\ (\cdot,C_B)^1_{n+1},\ \overline{(\cdot,C_B)}_{n+1}\}$$

or, as an alternative definition of an objective of this general type, one
might wish to consider the class of subsets of the outcome space defined below:

$$\mathcal A_{(\cdot,C_B)} = \{(\cdot,C_B)_1,\ (\cdot,C_B)_2,\ \ldots,\ (\cdot,C_B)_{n+1},\ \overline{(\cdot,C_B)}_{n+1}\}$$

where $(\cdot,C_B)_k = \bigcup_{t=1}^{k} (\cdot,C_B)^1_t$.


Weighted response scores

An appealing and frequently used objective in the development of
psychometric tests consists of assigning numerical scores to the correct
responses and errors that can occur on administration of the items of the test.
Applying this objective to the situation of teaching the two concepts A and B,
one might define random variables or scoring functions, say, $W^*$ as follows:

let $W^*((R_A)_t) = w_A$, $W^*((\bar R_A)_t) = -w_A$

and $W^*((R_B)_t) = w_B$, $W^*((\bar R_B)_t) = -w_B$

where $0 < w_A < w_B$.

The values $w_A$ and $w_B$ could be thought of as the utility to the
experimenter of correct responses respectively to concept A items and concept B
items. Consequently, following the statistical convention of using non-negative
loss functions instead of utility functions, one could define an equivalent
score function, say $W$, in terms of losses as follows:

let $W((R_A)_t) = w_B - w_A$, $W((\bar R_A)_t) = w_B + w_A$

and $W((R_B)_t) = 0$, $W((\bar R_B)_t) = 2w_B$.
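
One way to read the passage from $W^*$ to $W$ is that each loss equals the largest attainable utility minus the utility of the response. The text does not state this rule explicitly, so the sketch below is an interpretation; the weights and the function name `losses_from_utilities` are hypothetical.

```python
def losses_from_utilities(utilities):
    """Convert utilities to non-negative losses by subtracting from the maximum."""
    best = max(utilities.values())
    return {response: best - u for response, u in utilities.items()}


# Hypothetical weights satisfying 0 < w_A < w_B.
w_A, w_B = 1.0, 3.0

utilities = {
    "R_A": w_A,       # correct response to an A item
    "not_R_A": -w_A,  # incorrect response to an A item
    "R_B": w_B,       # correct response to a B item
    "not_R_B": -w_B,  # incorrect response to a B item
}

losses = losses_from_utilities(utilities)
```

Under this reading a correct response to a B item carries zero loss, and every other response is charged its shortfall from that best outcome.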

Although one could perhaps use such a system of response scores to develop an
objective for a standard sequential design game involving teaching experiments,
this type of setup provides a basis for introducing a related aspect of
sequential game theory which has not been dealt with in the outline of
sequential design of experiments presented earlier in this paper. The
sequential games which were outlined allow the experimenter to make use of a
sequence of subexperiments to gain partial information about nature's choices;
finally, the experimenter makes a terminal decision and conceptually a payoff
is then made in loss units at the end of each single play of the sequential
game. Some investigations have been made into the theory for playing sequences
of games (see Luce and Raiffa [15] for a survey of various types of situations
involving plays-in-sequence of games or sequential compounding of games).

Although there are many similar features found in the theory of sequential
games and the theory of sequences of games, it should be evident that the
characterization of terminal decision functions especially will vary
considerably between the two theories. On the other hand, questions such as
whether the sequences of subexperiments or sequences of games will terminate
with probability 1 arise in both of the two theories.

The  theory  of  sequences  of  games  will  not  be  elaborated  here  but  it  is 
important  to  emphasize  that  certain  objectives  and  their  associated  payoff 
functions  in  teaching  experiments  are  best  represented  as  a  sequence  of  games 
rather  than  as  a  sequential  game.  This  distinction  can  be  illustrated  by 
consideration  of  the  simple  urn  game  that  was  discussed  in  Part  2.  An  analogy 
between  that  urn  game  and  the  teaching  game  involving  the  teaching  of  the  two 
concepts  could  be  made  by  representing,  say,  concept  B  as  the  urn  and 
concept  A  as  the  urn  Uc-  Further,  let  the  draw  of  a  white  marble  be  inter¬ 
preted  as  equivalent  to  a  correct  response  and  the  draw  of  a  black  marble  be 
equated  with  an  incorrect  response.  Let  a  payoff  in  loss  units  be  defined  by 
the  scoring  function  W. 

An  example  of  a  sequential  game  involving  the  order  of  presentation  of  the 
two  urns  has  been  given.  Recall  that  player  1  had  two  pure  strategies-- 
present  the  urns  in  the  order  ( U^, )  or  in  the  order  (U^U^).  Player  2's 
terminal  actions  consisted  of  these  same  two  identifications  of  the  order  of 
presentation of the urns. Payoffs were defined in terms of whether player 2's
choice of a presentation order agreed or disagreed with the choice that player
1 had actually used. Player 2 was also allowed to perform experiments for a
fee in order to gain partial information about the composition of the two urns
before making his terminal decision.

As an example of a sequence of these urn games consider the situation
described as follows: Let player 1 have four urns. Let two of the urns be
identified as concept-B urns with respect to the scoring function $W$ and two
as concept-A urns with respect to $W$. Let the concept-B urns be relabeled as
$U_{B1}$ and $U_{B2}$ and suppose that the proportion of white marbles in
$U_{B1}$ exceeds the proportion of white marbles in $U_{B2}$. Let the concept-A
urns be similarly relabeled as $U_{A1}$ and $U_{A2}$ and the proportion of
white marbles in $U_{A1}$ be greater than the proportion of white in $U_{A2}$.

The first game in the sequence proceeds in the following way: Player 2
is allowed to request a draw from either a concept-B urn or a concept-A urn
but can only specify a choice from one of these two pairs and not a specific
one of the four urns. Suppose that player 2 selected the $U_B$ urns; then
player 1 is allowed to present either $U_{B1}$ or $U_{B2}$ for sampling.
Player 2 draws a marble at random from the presented urn and a payoff is made
for the first game as prescribed by the scoring function $W$. A second game is
now played following the same rules, only the composition of the urn presented
in the first game is changed by allowing sampling without replacement. A number
of variations of rules governing the total sequence of these games could be
introduced, e.g., a fixed number of games could be played, player 2 could
be charged an entry fee for each game and given an initial purse to
gamble with, etc. Roughly, a good strategy for player 2 in these sequences
of games would be the selection of a type of urn in each new game, given his
history of selections of the types of urns and resulting draws, which would
minimize his losses for the total sequence.

Similar sequences of games could be developed for teaching situations
involving two concepts and a scoring or loss function such as $W$. In these
situations the probabilities of obtaining correct or incorrect responses to
items of the two types would be characterized, however, by more complicated
relationships involving the learning rate parameters $\theta_A$, $\theta_{AB}$,
$\theta_B$, and $\theta_{BA}$ rather than simply by the transitions determined
by sampling without replacement.

Good strategies for such sequences of teaching games essentially should
minimize the sum over the sequences of the expected values of weighted
responses.

Expected trial of first reaching the state $(\cdot,C_B)$

In our earlier paper [11] dealing with the optimal design of teaching
experiments, we defined an optimal strategy as one which minimized the expected
trial of first mastering the more difficult concept B, or first reaching the
state $(\cdot,C_B)$, by appropriate choice of certain probabilities of
allocating type A or type B items at each trial, given the item and response at
the immediately preceding trial. In terms of sequential game theory, this
principle of designing the teaching experiment to minimize the expected trial
of first mastery of concept B is perhaps best represented as a sequence of
games.

Clearly, we did not incorporate a sampling plan with its component stopping
regions into the strategies considered in that earlier study, and consequently
we did not incorporate a terminal decision function of the standard type used
in sequential games, either.

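The quantity being minimized, the expected trial of first mastery of concept B, can be illustrated for a simplified case. The sketch below assumes a fixed, non-random item schedule rather than the response-dependent allocation probabilities of the earlier study, truncates after the scheduled trials, and uses hypothetical parameter values and a hypothetical function name `first_mastery_distribution`; states are assumed to be ordered as neither concept conditioned, A only, B only, both.

```python
def transition(item, th):
    """State transition matrix for an A item or a B item (assumed state order)."""
    if item == "A":
        a, ab = th["th_A"], th["th_AB"]
        return [[1 - a, a, 0, 0], [0, 1, 0, 0], [0, 0, 1 - ab, ab], [0, 0, 0, 1]]
    b, ba = th["th_B"], th["th_BA"]
    return [[1 - b, 0, b, 0], [0, 1 - ba, 0, ba], [0, 0, 1, 0], [0, 0, 0, 1]]


def first_mastery_distribution(p1, schedule, th):
    """Probabilities that concept B is first conditioned at trials 1, ..., n+1.

    schedule lists the item ("A" or "B") given at trials 1, ..., n.
    Returns (probs, leftover), where leftover is the probability that
    concept B is still unconditioned at the beginning of trial n+1.
    """
    b_conditioned = (2, 3)
    dist = list(p1)
    probs = []
    for item in [None] + list(schedule):
        if item is not None:
            T = transition(item, th)
            dist = [sum(dist[i] * T[i][j] for i in range(4)) for j in range(4)]
        probs.append(sum(dist[s] for s in b_conditioned))
        for s in b_conditioned:
            dist[s] = 0.0  # drop mass that has already reached mastery
    return probs, sum(dist)


# Hypothetical learning rates and a student who starts with nothing conditioned.
rates = {"th_A": 0.3, "th_AB": 0.6, "th_B": 0.5, "th_BA": 0.8}
probs, leftover = first_mastery_distribution([1.0, 0.0, 0.0, 0.0], ["B", "B", "B"], rates)

# Truncated contribution to the expected trial of first mastery.
expected_truncated = sum(t * p for t, p in enumerate(probs, start=1))
```

Comparing `expected_truncated` and `leftover` across different schedules gives a rough sense of how the order of A and B items changes the time to mastery of concept B.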
The item allocation rules which constituted our basic strategies in that
problem were concerned only with the choice of which subexperiment one should
continue with. One could think of that design situation as consisting of a
sequence of games where player 2 may select the type of game he wishes to
play at each trial (choice of an A item or a B item) and he is given the
information concerning which type of game he played at the preceding trial
and its outcome. The payoff to player 2 could be thought of as $t$ units of
loss at trial $t$ if the student reaches the state $(\cdot,C_B)$ at trial $t$,
and the payoff to player 2 is zero at trial $t$ otherwise. Consequently, the
function

$$\sum_{t=1}^{\infty} t\,P\{(\cdot,C_B)^1_t\}$$

could be interpreted as the experimenter's expected loss in an infinite
sequence of these related games.

More general objectives

Several examples of objectives in these two-concept teaching games have
been presented. There are a number of other more general objectives which are
much more difficult to formulate precisely.

In educational settings it is often desired that the consequences of
insufficient training on a set of stimulus materials be expressible in terms
of the subsequent rates of learning and performance on similar stimulus
materials used in related courses. To make precise such evaluations of losses
in terms of degree of transfer of training to other stimulus sets, one would
need to expand the mathematical model of the teaching process to include the
appropriate parameters that govern the transfer.

One often hears it said that a major reason for attempting to regulate
branching in a teaching program in accordance with the general abilities and
history of performance of individual students is that such branching
procedures promote and sustain the student's motivation to do well in the
teaching program. While there probably is a good deal of informal evidence to
support this reason for incorporating flexible branching rules in teaching
programs, it would, at this stage of our abilities to measure motivational
aspects of behavior, be the vaguest sort of speculation to discuss seriously
the "optimal design" of teaching experiments to promote and maintain students'
motivation levels.

Loss  Functions 

In  the  preliminary  discussion  of  objectives  and  objective  functions,  it 
was  noted  that  the  effects  of  the  choice  of  alternative  objectives  enter  the 
representation  of  a  sequential  game  through  the  loss  function  of  the  game. 

That  is,  the  set  of  pure  strategies  for  nature  is  restricted  by  the  statement 
of  an  objective  to  either  the  set  of  objective  events  implied  by  the  statement 
of  the  objective  or  to  the  set  of  values  of  an  objective  function  defined  on 
these  events. 

Again, several illustrations will be given to show, in this case, various
plausible loss functions that one might choose to use. For this purpose, it
will be sufficient to take up only one of the illustrative objectives: the
objective of achieving conditioning for both concept A and concept B.

Losses evaluated in terms of occurrence of mistakes

A simple, plausible loss function that is often used when the set of
terminal actions for the experimenter is identical to the set of states of
nature consists essentially of assigning a loss of 0 when the terminal action
selected agrees with the state of nature and a constant positive loss to all
other pairs of states of nature and terminal actions (mistakes or incorrect
actions). For example, let the set of states of nature be given by the class
of events $\mathcal A_{(C_A,C_B)^1}$ and suppose that the set of terminal
actions $A$ is given by $\mathcal A_{(C_A,C_B)}$. Losses evaluated in terms of
mistakes might be defined in a teaching experiment truncated at $n$ trials as
the following loss matrix indicates:

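$$
\begin{array}{l|cccccc}
\text{States of Nature} \backslash \text{Terminal Actions} & (C_A,C_B)_1 & (C_A,C_B)_2 & (C_A,C_B)_3 & \cdots & (C_A,C_B)_{n+1} & \overline{(C_A,C_B)}_{n+1} \\ \hline
(C_A,C_B)^1_1 & 0 & 0 & 0 & \cdots & 0 & 1 \\
(C_A,C_B)^1_2 & 1 & 0 & 0 & \cdots & 0 & 1 \\
(C_A,C_B)^1_3 & 1 & 1 & 0 & \cdots & 0 & 1 \\
\vdots & & & & & & \\
(C_A,C_B)^1_{n+1} & 1 & 1 & 1 & \cdots & 0 & 1 \\
\overline{(C_A,C_B)}_{n+1} & 1 & 1 & 1 & \cdots & 1 & k
\end{array}
$$
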



The losses in this matrix are considered determined up to a scalar multiplier.
The value $k$ in the lower right-hand corner of this matrix will be allowed to
range over the interval [0,1).

This loss function might express the payoffs in a no-data game; however,
it should be evident that the terminal action $(C_A,C_B)_n$ should always be at
least as preferable as any of the terminal actions $(C_A,C_B)_t$ for all
$t < n$. Rather than attempting to modify the loss functions to remove such
strong inherent determination of the preference ordering on the terminal
actions, frequently it will be most suitable to impose restrictions on what
values the terminal decision functions may take at various stages of
experimentation. Illustrations of such restrictions are given in the design
problems which are solved in Part 5.

Absolute error and quadratic loss functions
In some teaching situations it might be desired to evaluate losses
associated with terminal decisions by more stringent standards which consider
not only the occurrence of errors but also the magnitude of the errors. For
example, let the class of events $\mathcal{A}_{(C_A,C_B)}$ define an objective
and again let the set of terminal actions $A$ be identified with
$\mathcal{A}_{(C_A,C_B)}$. Suppose that an experimenter expressed his losses in
terms of the absolute value of the difference between the trial number which
he claimed was the trial when a student first mastered both concepts, say
$(C_A,C_B)_t$, and the correct trial number when first mastery occurred, say
$(C_A,C_B)_{t^*}$. The absolute errors generally would be defined as
$|t - t^*|$. The complete loss function defined in terms of absolute errors
might have the structure shown in the following loss matrix (perhaps defined
up to a scalar multiplier):

                                        Terminal Actions

States of Nature     (C_A,C_B)_1  (C_A,C_B)_2  (C_A,C_B)_3  ...  (C_A,C_B)_{n+1}  (C_A,C_B)^c_{n+1}

(C_A,C_B)_1               0            1            2       ...         n                 n
(C_A,C_B)_2               1            0            1       ...        n-1                n
(C_A,C_B)_3               2            1            0       ...        n-2                n
    ...                  ...          ...          ...      ...        ...               ...
(C_A,C_B)_{n+1}           n           n-1          n-2      ...         0                 n
(C_A,C_B)^c_{n+1}         n            n            n       ...         n                 m

The loss m, which occurs when a correct decision has been made that the student
had not learned after n trials, would perhaps best be restricted by a condition
such as 0 < m < n.

In some situations it might be more appropriate to treat the errors whose
losses are shown below the principal diagonal of this matrix differently from
the corresponding values above this diagonal. For example, the loss matrix shown
below might be more suitable than the matrix defined symmetrically in absolute
errors.

                                        Terminal Actions

States of Nature     (C_A,C_B)_1  (C_A,C_B)_2  (C_A,C_B)_3  ...  (C_A,C_B)_{n+1}  (C_A,C_B)^c_{n+1}

(C_A,C_B)_1               0            0            0       ...         0                 1
(C_A,C_B)_2               1            0            0       ...         0                 1
(C_A,C_B)_3               2            1            0       ...         0                 1
    ...                  ...          ...          ...      ...        ...               ...
(C_A,C_B)_{n+1}           n           n-1          n-2      ...         0                 n
(C_A,C_B)^c_{n+1}         n            n            n       ...         n                 m

Frequently, it may be more appropriate, and especially may be mathematically
more convenient, to define losses in terms of the square of the magnitude of the
error rather than in terms of absolute error. Loss functions whose values
are defined in this manner are usually called quadratic loss functions.
Obviously, if the individual elements in each of the two illustrative
loss matrices shown above were squared, the resulting matrices would constitute
listings of all the values of two quadratic loss functions.
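
The relation between the absolute-error and quadratic loss functions can be
sketched in a few lines of Python. This is only an illustration, not part of
the original report: the function names `absolute_error_loss` and
`quadratic_loss` are hypothetical, rows are taken to be states of nature and
columns terminal actions, and indices 0 through n stand for the events
$(C_A,C_B)_1$ through $(C_A,C_B)_{n+1}$, with index n+1 standing for the
complementary event.

```python
def absolute_error_loss(n, m):
    """Symmetric absolute-error loss matrix for an experiment truncated at n trials.

    Rows are states of nature, columns are terminal actions (an assumed layout).
    Indices 0..n stand for (C_A,C_B)_1 .. (C_A,C_B)_{n+1}; index n+1 stands for
    the complementary event "not conditioned after n trials".
    """
    size = n + 2
    loss = [[0] * size for _ in range(size)]
    for state in range(size):
        for action in range(size):
            if state == n + 1 and action == n + 1:
                loss[state][action] = m          # correct "not learned" decision
            elif state == n + 1 or action == n + 1:
                loss[state][action] = n          # wrong about learning altogether
            else:
                loss[state][action] = abs(state - action)  # |t - t*|
    return loss


def quadratic_loss(n, m):
    """Square every entry of the absolute-error matrix."""
    return [[value * value for value in row] for row in absolute_error_loss(n, m)]


if __name__ == "__main__":
    for row in absolute_error_loss(3, 1):
        print(row)
    for row in quadratic_loss(3, 1):
        print(row)
```

Squaring entrywise keeps the zero diagonal and penalizes large errors in the
claimed trial number more heavily than small ones.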

Risk  Functions 

The payoff functions in statistical games are commonly called risk
functions. Given a prior distribution $\pi$ and an objective $\mathcal{A}$ for a
game involving the sequential design of a teaching program, and letting
$\lambda$ represent an objective event in the class $\mathcal{A}$ which defines
the objective, a risk function $\rho$ will be defined as follows:

$$
\rho_{\pi}(e,b,d)_{\lambda} = \sum_{\omega \in \Omega} \sum_{t=1}^{n} \sum_{r \in b_t}
\left[ c_t(r) + L\bigl(\lambda, d(t,r)\bigr) \right] p(r \mid \omega, e)\, \pi(\omega)
$$

where  e  is an experiment (truncated at n trials)
       b  is a sampling plan consisting of stopping regions
          $b_t$ (t = 1,2,...,n)
       d  is a terminal decision function
       r  is a response sequence, the observable components of the
          sequence x which comprise the outcome space X
       $\omega$  is a parameter point
       $\lambda$  is an objective event
       c  is a cost function
       L  is a loss function

and $p(r \mid \omega, e)$ is a probability distribution of the outcome sequences
$r \in X$ when a parameter point $\omega$ and an experiment e are specified.

More general risk functions than the above might be required in some
decision situations (see Raiffa and Schlaifer [18] for descriptions of such
generalizations), but the definition of risk in terms of additive effects of
sampling costs and terminal losses is predominant in sequential game theory.

In fact, a further specialization of the risk function is commonly employed
in many studies of sequential games: the cost function c is considered to
depend only on the number of subexperiments used, t, and not on the outcomes
of the subexperiments performed (i.e., not on the values in the subsequences
$(r_1, r_2, \ldots, r_t)$). The restriction of the risk function to the case of
constant costs will be used in the development of optimal designs in Part 5.

A Bayes solution for an optimal design of a teaching program with risk
function $\rho_{\pi}(e,b,d)_{\lambda}$ can be obtained by finding a pure strategy
(e,b,d) which will minimize $\rho$ when the prior distribution $\pi$ is given. A
mathematical programming technique for finding an optimal design in accordance
with the Bayes principle will be elaborated in the next part of this paper.

5. Solutions for Best Designs in Some Miniature Teaching Experiments
A technique will be developed in this section of the report to solve for
best designs (or best strategies) in teaching experiments. Since the trees
of these games rapidly grow a large number of branches when the last trial n
is even a modest-sized number, it will be desirable for purposes of
exposition to keep n as small as possible. For this reason, 3-trial truncated
experiments were chosen for examples, as they are just large enough to allow
some interesting design features to be revealed and small enough to permit
detailed graphing of the overall structure of these experiments.

Other severe simplifications are made in these examples too, for the
purpose of promoting clarity of exposition of the results. These examples
obviously are not intended as serious efforts to design optimally any
experiments for specific teaching situations, but hopefully they should
illustrate a general technique for solving for best designs. Among the more
prominent of the further simplifications that are made are (1) the restriction
of consideration to only two types or populations of students and (2) the
restriction of the response distributions to the case of no guessing. Again,
these restrictions were made for simplicity of exposition of the optimization
technique; the removal of these restrictions, for the most part, has little
effect on the procedure for solutions for best designs.

Pure Strategies in a 3-Trial Teaching Experiment

Recall from the earlier description of a sequential statistical game
that a pure strategy for the statistician is a triple (e,b,d) where, in the
case of a teaching experiment, e is an item allocation rule, b is a sampling
plan, and d is a terminal decision function. It would appear that in most
truncated learning experiments appropriate terminal decision functions should
take only the value which indicates a conclusion that conditioning has occurred
at any trial prior to the last trial of the truncated experiment. In many
non-truncated learning experiments, it will be appropriate to define the set of
terminal actions A to consist of but the single conclusion that conditioning
has occurred.

In the several examples of solutions for best teaching strategies that are
taken up in this section, the objective which is adopted is the teaching of
both concepts A and B. Thus, the set of terminal actions will be defined to
consist of two values $a_c$ and $a_{\bar{c}}$, where $a_c$ represents the
conclusion that a subject is in the state $(C_A, C_B)$ and $a_{\bar{c}}$
represents the conclusion that a subject is in the complementary state

$$
(C_A, C_B)^c = (\bar{C}_A, \bar{C}_B) \cup (C_A, \bar{C}_B) \cup (\bar{C}_A, C_B).
$$

To clarify the restriction imposed on the terminal decision functions in these
problems: since these d's are defined on the product set $I_{n+1} \times X$ (and
here $I_{n+1}$ is the set of trial numbers, $t \in I_{n+1} = \{1,2,3,4\}$), the
restriction on the terminal decision functions is that $d(t,x) = a_c$ when
$t \le n$. For t = n+1, d(t,x) may in the present examples take either the
value $a_c$ or $a_{\bar{c}}$.

The tree of a pure strategy for a 3-trial teaching experiment adopting
the objective of teaching both concepts is shown in Figure 4 below. In this
example the item allocation rule or experiment $e_{i_0}$ is used in conjunction
with the specific sampling plan $b_{j_0}$ and decision function $d_{k_0}$. The
sampling plan $b_{j_0}$ partitions the outcome space X into the following
sequence of cylinder sets:

$$
\begin{aligned}
b_{0,j_0} &= \emptyset \quad \text{(null set)} \\
b_{1,j_0} &= \emptyset \\
b_{2,j_0} &= \{x : [r_1, r_2] = [(A,R_A)_1, (B,R_B)_2]\} \\
b_{3,j_0} &= \{x : [r_1] = [(A,\bar{R}_A)_1] \;\cup\; [r_1, r_2] = [(A,R_A)_1, (B,\bar{R}_B)_2]\}
\end{aligned}
$$

The terminal decision function must take the value $a_c$ for all outcome
sequences x which are elements of $b_{2,j_0}$. The terminal decision function
further partitions $b_{3,j_0}$ into two subsets. This terminal decision function
takes the value $a_c$ for that subset of $b_{3,j_0}$ for which
$r_3 = (B,R_B)_3$, and the function takes the value $a_{\bar{c}}$ when its
argument is any of the remaining sequences of $b_{3,j_0}$.

Normal  and  Extensive  Forms  of  a  Game 

An obvious way to solve for the best strategy in this type of sequential
statistical game would be to compute, for each pure strategy, its risk against
a given prior distribution $\pi$, and then to pick out the strategy (or
strategies) having the smallest risk. Unfortunately, although there are only a
finite number of pure strategies in the games considered here, the number of
pure strategies becomes so large so fast with increase in the truncation number
n that even with the aid of large high-speed computers, solution by sheer
enumeration of risks is almost never feasible.

A considerable reduction in computational effort can be gained by
representing these statistical games in an alternative equivalent manner. The
characterization of a game as a triple $G = (I, J, \mu)$ which was given in
Section 2 is referred to as the normal form of a game. Essentially, the normal
form of a two-person, zero-sum game consists of listing every pure strategy
for player 1 (a pure strategy for player 1 being a complete prescription of
what choices he will make at each of his decision points in the game given
all the available information of personal choices and random moves made prior
to these decision points) in the set I and every pure strategy for player 2 in
the set J. The specification of the utility function for player 1, $\mu$,
defined on $I \times J$ completes the description of the normal form of such a
game.

The representation of a game in extensive form roughly consists of:

(1) specifying in order each move available at the various stages of the game,
(2) identifying whether the move is a personal move to be made by one of the
    players or a random move, and identifying which player is to make the
    choice for personal moves,
(3) specifying the set of alternatives available at each move,
(4) specifying the probability distribution on the sets of alternatives for
    each random move, and
(5) finally giving the numerical-valued payoff to one of the two players for
    each realizable play of the game.

The extensive form of a game may be diagrammed as a tree also. It is
evident that the tree which describes the complete extensive form of a 3-trial
teaching experiment will have many more nodes or vertices than does, for example,
the tree which illustrates the representative pure strategy
$(e_{i_0}, b_{j_0}, d_{k_0})$. However, the feature of the extensive form which
makes it the more convenient representation for solving for best strategies in
many problems is that the number of branches to the tree which represents the
extensive form is typically a considerably smaller number than the number of
pure strategies for the statistician which appear in the normal form.

Space  limitations  prohibit  the  diagramming  of  the  complete  tree  of  the 
extensive  form  of  this  teaching  game  except  for  a  very  small  number  of  trials. 
In  fact,  it  will  be  sufficient  here  to  illustrate  the  tree  of  a  single-trial 
teaching experiment. All pure strategies for the experimenter will be
derived from the overall tree.

Several  additional  notations  will  be  required  to  graph  the  extensive 
form  of  this  teaching  experiment.  Following  common  practice,  nature  will  be 
designated  as  player  1,  the  experimenter  or  statistician  will  be  designated 
as  player  2,  and  random  moves  will  be  assigned  to  an  umpire  or  player  0.  If 
nature is in the state $(C_A, C_B)$ at any trial, this state will be indicated
by the abbreviation $C^*$. Conversely, if nature is in the state
$(\bar{C}_A, \bar{C}_B) \cup (C_A, \bar{C}_B) \cup (\bar{C}_A, C_B)$, this state
will be abbreviated as $\bar{C}^*$.

The tree of the extensive form of a single-trial teaching experiment of
the two-concept type under consideration here is shown in Figure 5. From the
tree of the extensive form of this single-trial teaching experiment, 9 trees
may be derived which represent all of the pure strategies for the statistician
in this case. The trees of these 9 pure strategies are diagrammed in Figure 6
below:

Figure 6. Trees of the 9 Pure Experimenter's Strategies Derived from Figure 5

Consequently, for this exceedingly simple single-trial teaching experiment
there are only 9 pure strategies for the experimenter, while a count of the
number of terminal vertices of the graph shown in Figure 5 reveals that the
tree of the extensive form of the single-trial game has 18 branches. In the
single-trial case only, enumeration of the risks associated with each of the
experimenter's pure strategies would be the simpler way to determine the best
pure strategy. One can readily derive the following two formulas which give
respectively the number of branches of the tree of the extensive form of
this teaching experiment and the number of pure strategies for the experimenter
in this game as a function of the truncation trial number, n:

Number of Branches to Tree of Extensive Form, $T_1(n)$

$$
T_1(n) = \sum_{t=1}^{n} 2^{2(t-1)+1} + 2^{2(n+1)} \quad \text{for } n = 1,2,3,\ldots
$$

Number of Experimenter's Pure Strategies, $T_2(n)$

$$
T_2(1) = 9, \qquad T_2(n) = 1 + 2\,\bigl(T_2(n-1)\bigr)^2 \quad \text{for } n = 2,3,4,\ldots
$$


The number of pure strategies for the experimenter has been derived here under
the assumption that the experimenter has perfect recall of all of his earlier
moves whenever he reaches another choice point. Frequently, the number of
pure strategies for a player in an abstract game is computed in a redundant
fashion yielding much larger numbers of strategies than the values of $T_2(n)$
given here (e.g., see McKinsey [16] for a discussion of ways of counting all
strategies). However, for a rough comparison of the computational effort
required by sheer enumeration of the risks of each pure strategy to find the
minimum-risk strategy versus the computational technique to be presented,
which considers the extensive form of the game, the values of $T_1(n)$ and
$T_2(n)$ are appropriate.

In Table 1 the values of $T_1(n)$ and $T_2(n)$ are listed for n = 1,2,3,4,5.
It is evident that even $T_1(n)$ increases quite rapidly with n, but at a rate
considerably less than $T_2(n)$.

Table 1

n     T_1(n)     T_2(n)

1         18     9
2         74     163
3        298     53,139
4      1,194     5,647,506,643
5      4,778     63,788,662,565,458,258,899
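
The two counting formulas can be checked against Table 1 with a short Python
sketch. The function names below are hypothetical and chosen only for this
illustration; the arithmetic follows the formulas for $T_1(n)$ and $T_2(n)$
as stated above.

```python
def extensive_form_branches(n):
    """T_1(n): number of branches of the extensive-form tree."""
    return sum(2 ** (2 * (t - 1) + 1) for t in range(1, n + 1)) + 2 ** (2 * (n + 1))


def pure_strategies(n):
    """T_2(n): number of pure strategies for the experimenter (perfect recall)."""
    count = 9                      # T_2(1)
    for _ in range(2, n + 1):
        count = 1 + 2 * count ** 2 # T_2(n) = 1 + 2 * T_2(n-1)^2
    return count


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, extensive_form_branches(n), pure_strategies(n))
```

The branch count grows roughly by a factor of 4 per added trial, whereas the
strategy count is roughly squared at each step, which is why enumeration of
pure strategies becomes infeasible so quickly.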

Solution for Best Designs by Backward Induction

A general technique is known for finding best designs for sequential
experimentation when the set of states of nature $\Omega$ and the set of the
experimenter's terminal actions A are finite. In this case, algorithms can be
set up which, with the computational assistance of a large high-speed computer,
permit one to solve for best designs in a number of circumstances. Blackwell and
Girshick [4] refer to this technique as "backwards induction." Raiffa and
Schlaifer [18] suggest that the procedure might better be called "averaging
out and folding back." The folding back stages of the solution are done in
accordance with the "principle of optimality" of dynamic programming
(Bellman [3]), and indeed the backward induction procedure can be viewed as an
application of dynamic programming.

To complete the representation of the extensive form of the single-trial
teaching experiment which has been partially depicted by the tree in Figure 5,
one must assign the numerical-valued payoff to one of the players for each
complete play or branch of the game. Furthermore, a probability distribution may
be defined over the branches of the tree by assuming, at the outset of the
backwards induction process, that the conditional probability of the
experimenter's selection of each alternative at any of his choice points is 1,
given the entire history along the path leading to these choice points. As the
folding-back process proceeds, these conditional probabilities of selection of
an alternative become 1 for the alternative chosen to be best at a particular
choice point, and 0 for the remaining alternatives at that point.

The backwards induction technique will be illustrated by considering the
tree of the single-trial experiment shown in Figure 5. The objective in this
procedure is to find which of the 9 pure strategies depicted in Figure 6 is
the best strategy for a particular game. The essentials of the computational
process in this simple case can be shown by establishing two tables of values.

In Table 2.1, all of the branches of the tree which represent the
utilization of either an A item or a B item experiment are listed (each of these
branches may be identified as including 5 nodes or vertices). Along with each
play or game sequence, the probability of the sequence is listed. The payoff
of each sequence or play of the game to player 2 is listed in column 3 of
Table 2.1 as the loss to player 2. For the moment, the values of these
probabilities and losses may be considered to be given numbers. The basis
for derivation of these particular values will be indicated later.

The 16 sequences in Table 2.1 have been grouped there into 8 pairs. Each
pair represents the choice of one of the 2 alternative terminal actions $a_c$ or
$a_{\bar{c}}$ when one of the 4 experimental outcomes $(A,\bar{R}_A)$,
$(A,R_A)$, $(B,\bar{R}_B)$, or $(B,R_B)$ has occurred. Within each of these 8
pairs, player 1 or nature has chosen the state $\bar{C}^*$ or $C^*$; however,
player 2 does not have precise knowledge of which state player 1 has chosen but
only knows the probabilities of each sequence. The losses to player 2, if he
chooses $a_c$ or $a_{\bar{c}}$ at "Trial 2" when nature chooses $\bar{C}^*$ or
$C^*$, are the values given in column 3. Since player 2 does not know the
precise loss he will incur from taking either of his terminal actions in view
of any of the four experimental outcomes, he computes his expected loss over
the probabilities given in column 2. The resulting 8 values of expected losses
are given in column 4. This phase of the process represents an "averaging out"
stage.

On the basis of the expected losses, the best actions at the "Trial 2"
level of the game are now determined for each of the 4 possible experimental
outcomes that could occur. For example, if a type A item had been administered
and the incorrect response $\bar{R}_A$ occurred, then the expected loss of
taking the action $a_{\bar{c}}$ is .26875 while the expected loss of taking
action $a_c$ in this circumstance is .05. Consequently, the best action, or the
action which minimizes the expected loss in this case, is $a_c$. The 4 best
actions at "Trial 2" in this game are listed in column 5 of Table 2.1.

The sets of sequences representing the best actions at "Trial 2" are now
"folded back" and considered along with terminal actions permitted at Trial 1
in order to complete the determination of the best pure strategy for player 2.
In Table 2.2, the events or sets of sequences which correspond to the best
actions at "Trial 2" are listed in column 1 along with the only additional
terminal action permitted at Trial 1, $a_c$. The probabilities of these events
are given in column 2 of Table 2.2 (the first four values of this column are
carried along from Table 2.1 while the last two values may be considered to
be given numbers).

In the third column of Table 2.2, losses to player 2 associated with each
of the 6 events are listed. The first four values of this column are the
expected losses of these best four actions at "Trial 2." The fifth and sixth
values in this column are the given values of taking the terminal action $a_c$
when respectively nature chooses the state $\bar{C}^*$ or $C^*$.

The 6 events in Table 2.2 have also been grouped into pairs. Within each
of the three pairs, it is again necessary to "average out" the losses
associated with the choices of player 2. In the case of the pairs
$(A,\bar{R}_A,a_c)$, $(A,R_A,a_c)$ and $(B,\bar{R}_B,a_c)$, $(B,R_B,a_c)$, the
expectations are computed over the probability distributions of the responses
$\bar{R}_A, R_A$ and $\bar{R}_B, R_B$. Because the expected losses carried
forward from the "Trial 2" level were already weighted by the probabilities of
the sequences, the expected loss associated with the choice of the item A (or B)
is computed by merely summing the two higher-level expected losses. The
expected loss of the action $a_c$ at Trial 1 is computed by averaging the
losses of that action given in column 3 over the corresponding probabilities
given in column 2.

The best pure strategy for this particular single-trial teaching game
may now be identified as the one of the three strategies which has the minimum
expected loss or risk. The three expected losses in question are given in
column 4 of Table 2.2. The strategy with minimum risk is seen to be the
strategy which includes the two sets of branches
$(A,\bar{R}_A,a_c) \cup (A,R_A,a_c)$. The tree of this best strategy is
tree (4) in Figure 6.

Table 2.1

Branch or                                Probability   Loss to Player 2   Expected Loss of      Best Action at
Sequence                                 of Sequence   Associated with    Terminal Action at    "Trial 2"
                                                       Sequence           "Trial 2" to Player 2

(A, \bar{R}_A, a_{\bar c}, \bar{C}*)        .025            .85
(A, \bar{R}_A, a_{\bar c}, C*)              .225           1.10               .26875
(A, \bar{R}_A, a_c, \bar{C}*)               .025           1.10                                 (A, \bar{R}_A, a_c)
(A, \bar{R}_A, a_c, C*)                     .225            .10               .05

(A, R_A, a_{\bar c}, \bar{C}*)              .25             .85
(A, R_A, a_{\bar c}, C*)                    .50            1.10               .7625
(A, R_A, a_c, \bar{C}*)                     .25            1.10                                 (A, R_A, a_c)
(A, R_A, a_c, C*)                           .50             .10               .325

(B, \bar{R}_B, a_{\bar c}, \bar{C}*)        .05             .85
(B, \bar{R}_B, a_{\bar c}, C*)              .20            1.10               .2625
(B, \bar{R}_B, a_c, \bar{C}*)               .05            1.10                                 (B, \bar{R}_B, a_c)
(B, \bar{R}_B, a_c, C*)                     .20             .10               .075

(B, R_B, a_{\bar c}, \bar{C}*)              .25             .85
(B, R_B, a_{\bar c}, C*)                    .50            1.10               .7625
(B, R_B, a_c, \bar{C}*)                     .25            1.10                                 (B, R_B, a_c)
(B, R_B, a_c, C*)                           .50             .10               .325

Table 2.2

Event                    Probability   Loss to Player 2   Expected Loss of    Best Pure Strategy
                         of Event      Associated with    Pure Strategy to
                                       Event              Player 2

(A, \bar{R}_A, a_c)         .25            .05
(A, R_A, a_c)               .75            .325               .375            (A, \bar{R}_A, a_c) U (A, R_A, a_c)

(B, \bar{R}_B, a_c)         .25            .075
(B, R_B, a_c)               .75            .325               .40

(a_c, \bar{C}*)             .50           1
(a_c, C*)                   .50           0                   .50
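
The averaging-out and folding-back steps for this single-trial game can be
retraced with a short Python sketch. This is an illustration only: the data
layout, the labels "wrong"/"right" for the responses, and the names `TRIAL2`,
`STOP_AT_TRIAL1`, `expected_loss`, and `backward_induction` are hypothetical;
the probabilities and losses are those listed in Tables 2.1 and 2.2.

```python
# For each experimental outcome and each terminal action at "Trial 2":
# (probability, loss) for nature in state not-conditioned, then conditioned.
TRIAL2 = {
    ("A", "wrong"): {"a_cbar": [(0.025, 0.85), (0.225, 1.10)],
                     "a_c":    [(0.025, 1.10), (0.225, 0.10)]},
    ("A", "right"): {"a_cbar": [(0.25, 0.85), (0.50, 1.10)],
                     "a_c":    [(0.25, 1.10), (0.50, 0.10)]},
    ("B", "wrong"): {"a_cbar": [(0.05, 0.85), (0.20, 1.10)],
                     "a_c":    [(0.05, 1.10), (0.20, 0.10)]},
    ("B", "right"): {"a_cbar": [(0.25, 0.85), (0.50, 1.10)],
                     "a_c":    [(0.25, 1.10), (0.50, 0.10)]},
}
# Terminal action a_c taken at Trial 1 (no item administered).
STOP_AT_TRIAL1 = [(0.50, 1.0), (0.50, 0.0)]


def expected_loss(pairs):
    """Averaging out: sum of probability * loss over nature's states."""
    return sum(p * loss for p, loss in pairs)


def backward_induction():
    # Step 1: best terminal action at "Trial 2" for each experimental outcome.
    best_at_trial2 = {}
    for outcome, actions in TRIAL2.items():
        action = min(actions, key=lambda a: expected_loss(actions[a]))
        best_at_trial2[outcome] = (action, expected_loss(actions[action]))
    # Step 2: fold back. The sequence probabilities are joint probabilities,
    # so the risk of choosing an item is the sum over its two responses.
    risks = {"stop": expected_loss(STOP_AT_TRIAL1)}
    for item in ("A", "B"):
        risks[item] = sum(value for (i, _), (_, value) in best_at_trial2.items()
                          if i == item)
    best = min(risks, key=risks.get)
    return best_at_trial2, risks, best


if __name__ == "__main__":
    best_at_trial2, risks, best = backward_induction()
    print(best_at_trial2)
    print(risks, best)
```

The same two steps, minimizing at the experimenter's choice points and
averaging at the random moves, are repeated level by level in longer
experiments.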

Examples of Best Designs for 3-Trial Teaching Experiments

The single-trial example of a teaching experiment illustrated the general
form of the backward induction computational process, but it was such an
abbreviated example that it did not permit a very interesting look at item
allocation problems. Furthermore, it will be desirable to explore the effects
that alternative loss functions and prior probability distributions on the
parameters have on the structure of best sequential designs. For these reasons,
five examples of best designs for 3-trial, two-concept teaching experiments
were computed. The results of these computations are described in the
remainder of Section 5.


Description  of  the  alternative  populations  of  students 

The role of prior probability distributions on the parameter set $\Omega$ may
be introduced by reference to several classes or populations of students. Thus
each point $\omega \in \Omega$ will be viewed as determining a population of
students. It will be sufficient for the objectives of these examples to
restrict $\Omega$ to consist of two values, say, $\omega_1$ and $\omega_2$. The
components of $\omega_1$ were given values which might be considered
representative of a relatively slow-learning population of students, while the
components of $\omega_2$ were given values that might represent a relatively
fast-learning population of students. Recall that each parameter point
$\omega$ in the two-concept teaching process under consideration has
10 components, i.e.,


0)  = 


gA,6B'  SA'  9AB’  QB* 

In each of the five design problems which follow, these values were chosen to
represent the population of slow-learning students, $\omega_1$, and the
population of rapid-learning students, $\omega_2$:

$$
\omega_1 = [0, 0, .40, .50, .10, .20, .50, .25, .25, 0]
$$

and

$$
\omega_2 = [0, 0, .80, .90, .60, .80, 0, .25, .25, .50].
$$


The choice of the value 0 for the probabilities of guessing correct
responses, $g_A$ and $g_B$, was made in order to make the resulting best strategies
somewhat  more  intuitively  clear.  This  choice  has  relatively  minor  effects  on 
the  nature  of  the  computations.  Values  of  the  other  components  of  these  two 
parameter  points  were  picked  so  that  the  probability  distributions  on  the 
sequences  of  these  two  processes  would  be  well  separated.  Furthermore,  the 
values  of  these  components  were  also  deliberately  chosen  at  levels  which  would 
allow  the  effects  of  sequential  experimentation  to  show  up  even  though  the 
experiments  are  truncated  at  the  small  number,  3  trials. 


Computation  of  probabilities  of  sequences  in  the  extensive  form 
The  matrix  operator  computational  apparatus  which  was  outlined  in  Section  3 
offers  a  relatively  simple  method  for  computing  the  probabilities  of  the  sequences 
required  for  the  backward  induction  solution.  The  two  parameter  points  and 
each  provide  the  values  necessary  to  define  a  vector  of  initial  state 
probabilities,  two  matrices  of  transition  probabilities  (one  for  item  A  trials 
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and  one  for  item  B  trials )»  and  four  diagonal  matrices  of  conditional  response 
probabilities.  That  is,  given  aao^  (h  =  1,2)  the  following  matrices  are  defined: 

-l,h'  PRa  h  =  DA,hPA,h,  =  °A,hPA,h, 

\h  ’  V**’  \h  =  V  V' 

In  addition  to  these  matrices,  two  vectors  which  will  determine  the 
probabilities  of  nature's  being  respectively  in  the  states  C*  =  anc* 

=  (S’a.jS'b ) U( CA, ^B) U(^A> CB)  Will  be  defined;  let 

U'  =  [0,0, 0,1] 

and 

[1,1, 1,0]. 

Finally,  let  be  the  vector  of  prior  probabilities  that  nature 

is  in  the  state  or  a^. 

A representative pair of plays from the extensive form of this 3-trial
teaching experiment will be considered. For example, the following two plays
or sequences are representative of the class of longest plays in this game:

    $[(A,\bar{R}_A)_1,\ (A,R_A)_2,\ (B,R_B)_3,\ (a_c)_4,\ (\bar{C}^*)_4]$

    $[(A,\bar{R}_A)_1,\ (A,R_A)_2,\ (B,R_B)_3,\ (a_c)_4,\ (C^*)_4]$

If it were certain that a student whose responses were described by the above
sequences was from the population $\omega_1$, then the probability of the first sequence
would be given by

    $p'_{1,1}\, P_{\bar{R}_A,1}\, P_{R_A,1}\, P_{R_B,1}\, \bar{U}$,

while the probability of the second sequence would be

    $p'_{1,1}\, P_{\bar{R}_A,1}\, P_{R_A,1}\, P_{R_B,1}\, U$.

On the other hand, if it were certain that the population $\omega_2$ obtained, the
probabilities of the first and second sequences would be given respectively by

    $p'_{1,2}\, P_{\bar{R}_A,2}\, P_{R_A,2}\, P_{R_B,2}\, \bar{U}$

    $p'_{1,2}\, P_{\bar{R}_A,2}\, P_{R_A,2}\, P_{R_B,2}\, U$.

These computations represent the situations where the prior distributions are
$\pi'_{11} = [1,0]$ and $\pi'_{12} = [0,1]$.

For an arbitrary prior distribution vector $\pi'_i = [\pi_{1,i}, \pi_{2,i}]$, the marginal
probability of the first play is obtained by computing the weighted sum

    $\pi_{1,i}\,(p'_{1,1} P_{\bar{R}_A,1} P_{R_A,1} P_{R_B,1} \bar{U}) + \pi_{2,i}\,(p'_{1,2} P_{\bar{R}_A,2} P_{R_A,2} P_{R_B,2} \bar{U})$

and the probability of the second play is obtained by computing the weighted sum

    $\pi_{1,i}\,(p'_{1,1} P_{\bar{R}_A,1} P_{R_A,1} P_{R_B,1} U) + \pi_{2,i}\,(p'_{1,2} P_{\bar{R}_A,2} P_{R_A,2} P_{R_B,2} U)$.

These  illustrative  computations  should  be  sufficient  to  indicate  the 
general  scheme  for  computing  the  probability  of  each  play  in  the  extensive 
form  of  this  game.  It  should  also  be  evident  from  these  illustrations  that 
the extension of this game from the case of 2 populations to m populations
should not increase the computational effort beyond feasible bounds if m is
not too large (perhaps in the range 10 to 25).
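
The scheme just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state ordering, the form of the transition matrices, the learning rates `theta_a` and `theta_b`, and the initial vector `p1` are hypothetical stand-ins, not the values defined in Section 3. Each response operator is a diagonal response matrix times a transition matrix, and a play's probability is the initial row vector times the chain of operators times the final selector vector.

```python
# Illustrative sketch only: transition structure, state ordering and all
# numeric values are assumptions, not the report's Section 3 values.
# Assumed state order: (~CA,~CB), (CA,~CB), (~CA,CB), (CA,CB).


def vec_mat(v, m):
    """Multiply a row vector by a matrix."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


def diag_times(d, m):
    """Multiply diag(d) by the matrix m (scales row i of m by d[i])."""
    return [[d[i] * x for x in row] for i, row in enumerate(m)]


def response_operators(theta_a, theta_b, g_a=0.0, g_b=0.0):
    """Return the four operators D*P keyed by (item, response_correct)."""
    p_a = [[1 - theta_a, theta_a, 0.0, 0.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1 - theta_a, theta_a],
           [0.0, 0.0, 0.0, 1.0]]
    p_b = [[1 - theta_b, 0.0, theta_b, 0.0],
           [0.0, 1 - theta_b, 0.0, theta_b],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]
    d_a = [g_a, 1.0, g_a, 1.0]  # P(correct on item A | state)
    d_b = [g_b, g_b, 1.0, 1.0]  # P(correct on item B | state)
    return {
        ("A", True): diag_times(d_a, p_a),
        ("A", False): diag_times([1 - x for x in d_a], p_a),
        ("B", True): diag_times(d_b, p_b),
        ("B", False): diag_times([1 - x for x in d_b], p_b),
    }


U = [0.0, 0.0, 0.0, 1.0]      # selects the state C* = (CA, CB)
U_BAR = [1.0, 1.0, 1.0, 0.0]  # selects the three complementary states


def play_probability(p1, ops, trials, final):
    """p1' times the chain of response operators times the final vector."""
    v = list(p1)
    for item, correct in trials:
        v = vec_mat(v, ops[(item, correct)])
    return sum(x * y for x, y in zip(v, final))


def marginal_play_probability(prior, p1s, ops_list, trials, final):
    """Prior-weighted sum of the per-population play probabilities."""
    return sum(w * play_probability(p1, ops, trials, final)
               for w, p1, ops in zip(prior, p1s, ops_list))


trials = [("A", False), ("A", True), ("B", True)]
ops_1 = response_operators(theta_a=0.5, theta_b=0.25)  # hypothetical population 1
ops_2 = response_operators(theta_a=0.75, theta_b=0.5)  # hypothetical population 2
p1 = [0.25, 0.25, 0.25, 0.25]                          # hypothetical initial vector
prob = marginal_play_probability([0.5, 0.5], [p1, p1], [ops_1, ops_2], trials, U)
```

Extending the sketch from 2 to m populations only lengthens the three lists passed to `marginal_play_probability`, which is consistent with the remark that the effort grows modestly with m.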

Description  of  the  cost  function 

In  these  examples  of  solution  for  best  designs,  the  simple  case  of 
constant  cost  of  experimentation  for  each  trial  will  be  considered.  It 
was  found  that  a  cost  of  .1  unit  per  trial  was  suitable  to  reveal  the 
effects  of  sequential  experimentation  in  these  examples.  Since  the  costs  and 
losses  in  these  problems  need  to  be  measured  along  some  common  scale,  for 
the  purposes  of  these  illustrations,  a  scale  will  be  used  whose  unit  will 
arbitrarily  be  called  a  utile. 

Thus, the complete cost function used in these examples is defined by
the following table:

    t        1     2     3     4
    c(t)     0    .1    .2    .3

Best designs when losses are expressed symmetrically in terms of errors

The basic loss function which will be considered throughout the five
examples to follow is the loss function common to two-action decision problems.
In the statistical literature the most frequent case of such a two-action
problem is the testing of a simple hypothesis against a simple alternative.

The  restrictions  on  the  terminal  decision  functions  which  have  been 
imposed  on  these  teaching  experiments  lead  to  the  determination  of  two 
loss matrices. At the level of the terminal decisions for "trial n + 1"
(in these examples, "Trial 4") the two actions $a_c$ and $a_{\bar{c}}$ are permitted.
Consequently, at this level the following loss matrix obtains:

                          Terminal Action
    State of Nature     $a_c$    $a_{\bar{c}}$
    $C^*$                 0          1
    $\bar{C}^*$           1          0

On the other hand, only the terminal action $a_c$ has been chosen to be permissible
at trials 1, 2, ..., n in this teaching experiment; thus, the loss matrix which
is applicable at these trials is the first column of the above matrix.

Two examples of best teaching strategies were computed using this familiar
loss function. The principal motivation in solving for these two best
strategies was to highlight the fact that one must consider the
characterization of the experimenter's losses very carefully; otherwise one will
arrive at strategies which may be best in terms of minimizing the specified risk
but could hardly be considered best in terms of representing an acceptable
teaching strategy.

Example 1: Best Design for a Rapid Learners Population

In this first example, the prior distribution $\pi'_{12} = [0,1]$ was assumed. The
parameter points $\omega_1$ and $\omega_2$ and the constant cost function c(.) with increment
.1 utile are used throughout all five examples. This example and the next
example assume the loss function which has been described above.

To evaluate the payoff of each play in the extensive form of this teaching
experiment to player 2, one must consider both the sampling cost and the terminal
loss associated with the play. These components of the payoff can be collected
into four matrices, say, $L_1(1)$, $L_1(2)$, $L_1(3)$ and $L_1(4)$, which correspond to the
sets of plays which respectively use 0 trials, 1 trial, 2 trials, and 3 trials.
These matrices are defined below:

                   $L_1(1)$     $L_1(2)$     $L_1(3)$
                    $a_c$        $a_c$        $a_c$
    $C^*$            0           .1           .2
    $\bar{C}^*$      1          1.1          1.2

while

                   $L_1(4)$
                    $a_c$      $a_{\bar{c}}$
    $C^*$           .3          1.3
    $\bar{C}^*$    1.3           .3
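
As a small sketch of how these loss-plus-cost tables arise, the following Python adds the accumulated sampling cost c(t) to a 0-1 terminal loss. The dictionary layout and the names `BASE_LOSS`, `cost` and `loss_plus_cost` are illustrative choices, and the 0-1 base loss is the one implied by the entries of L_1(4) less the cost of .3.

```python
# Sketch: loss-plus-cost tables built from a 0-1 terminal loss and a
# constant cost per trial. Names and layout are illustrative assumptions.
COST_PER_TRIAL = 0.1  # utiles per trial, as stated in the text

# Base terminal loss: rows are the true state, columns the terminal action.
BASE_LOSS = {
    "C*": {"a_c": 0.0, "a_cbar": 1.0},
    "not C*": {"a_c": 1.0, "a_cbar": 0.0},
}


def cost(t):
    """Accumulated sampling cost when the terminal action is taken at trial t."""
    return COST_PER_TRIAL * (t - 1)


def loss_plus_cost(t, n_trials=3):
    """Return the table for trial t; only a_c is allowed before trial n + 1."""
    actions = ["a_c", "a_cbar"] if t == n_trials + 1 else ["a_c"]
    return {
        state: {a: round(BASE_LOSS[state][a] + cost(t), 10) for a in actions}
        for state in BASE_LOSS
    }


tables = {t: loss_plus_cost(t) for t in (1, 2, 3, 4)}
```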


The expected payoff to the experimenter of the representative pair of
plays, $[(A,\bar{R}_A)_1, (A,R_A)_2, (B,R_B)_3, (a_c)_4, (\bar{C}^*)_4]$ and
$[(A,\bar{R}_A)_1, (A,R_A)_2, (B,R_B)_3, (a_c)_4, (C^*)_4]$, is obtained by computing:

    $1.3\,(p'_{1,2} P_{\bar{R}_A,2} P_{R_A,2} P_{R_B,2} \bar{U}) + .3\,(p'_{1,2} P_{\bar{R}_A,2} P_{R_A,2} P_{R_B,2} U)$.

This computation represents a typical "averaging out" calculation required in
the course of the backward induction process. It should be evident that
these computations would be only slightly more complicated if an arbitrary
prior distribution vector had been assumed. The expected payoff would then be
computed as:

    $1.3\,[\pi_{1,i}\,(p'_{1,1} P_{\bar{R}_A,1} P_{R_A,1} P_{R_B,1} \bar{U}) + \pi_{2,i}\,(p'_{1,2} P_{\bar{R}_A,2} P_{R_A,2} P_{R_B,2} \bar{U})]$
    $+ .3\,[\pi_{1,i}\,(p'_{1,1} P_{\bar{R}_A,1} P_{R_A,1} P_{R_B,1} U) + \pi_{2,i}\,(p'_{1,2} P_{\bar{R}_A,2} P_{R_A,2} P_{R_B,2} U)]$.
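
The "averaging out" step can be written as a short function. The sketch below assumes the per-population play probabilities have already been computed; the numbers in `play_probs` are made up for illustration and are not taken from the report.

```python
def expected_payoff(prior, play_probs, losses):
    """Sum over terminal states of loss times prior-weighted play probability.

    prior      -- [pi_1, pi_2], prior over the two populations
    play_probs -- {state: [probability under population 1, under population 2]}
    losses     -- {state: loss-plus-cost of the terminal action in that state}
    """
    total = 0.0
    for state, loss in losses.items():
        marginal = sum(w * p for w, p in zip(prior, play_probs[state]))
        total += loss * marginal
    return total


# Hypothetical play probabilities for one pair of plays sharing a history.
play_probs = {"not C*": [0.04, 0.01], "C*": [0.02, 0.10]}
losses = {"not C*": 1.3, "C*": 0.3}  # action a_c at "Trial 4"

degenerate = expected_payoff([0.0, 1.0], play_probs, losses)  # prior [0, 1]
mixed = expected_payoff([0.5, 0.5], play_probs, losses)       # arbitrary prior
```

With the degenerate prior only the second population's terms survive, which is the special case used in Example 1.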


A computational procedure for solving for the best strategy in these
examples was set up and carried out on a desk calculator since the computational
effort required did not merit the development of a computer program. The
resulting best design for this population of students and this familiar loss
function is graphed in Figure 7. Two features of this design require some
discussion although in general the design seems to represent a "reasonable"
teaching strategy.

The branches $[(B,\bar{R}_B)_1, (B,R_B)_2, (a_c)_3]$ and
$[(B,\bar{R}_B)_1, (B,\bar{R}_B)_2, (B,R_B)_3, (a_c)_4]$
of the tree of this pure strategy may appear to represent unusual terminal
actions at first glance, but inspection of the set of particular values of
initial state probabilities in this example confirms the appropriateness of
these actions. Since $g_B = 0$ in these examples, the response at Trial 1
indicates that this student must have initially been in the subpopulation
whose initial state of conditioning is $(C_A,\bar{C}_B)$ (the initial probability of the
only other applicable state $(\bar{C}_A,\bar{C}_B)$ having been assumed to be zero). Therefore,
when the correct response occurs at trial 2 or 3, this is errorless evidence
that the student is now in the desired state $(C_A, C_B)$.

The branch $[(B,R_B)_1, (A,R_A)_2, (a_c)_3]$ of this tree will be found to be
a situation where there is some probability of being in error by taking this
terminal action in the face of the evidence from the responses $(R_B)_1$ and $(R_A)_2$,
but the probability of making an error is so small that it does not pay to
use another trial. The branch [(B, ...), ..., $(a_c)_4$] represents
a circumstance where it paid to use all 3 trials but where it is now best to take
the terminal action $a_c$ at "Trial 4" since the probability of the student's
being in the state $(C_A, C_B)$ at "Trial 4" is sufficiently high.

Example  2:  Best  Design  for  a  Slow  Learners  Population 

The only difference between Example 2 and Example 1 is that the prior
distribution vector $\pi'_{11} = [1,0]$ is used instead of the vector $\pi'_{12}$. That is,
in the present example one assumes at the outset the student is definitely
from the slow-learning population instead of from the rapid-learning population.

The tree of the best design for this example is given in Figure 8.

Although this is the best strategy for this particular population under the
conventional loss function being employed, most educators would probably
question the "bestness" of this strategy from the standpoint of appropriate
teaching objectives. One might describe this strategy as a "defensive teacher's
strategy," the apparent objective being to give the hardest possible sequence
of items in order to conclude with minimum risk that most individuals from
this population did not learn the required concepts.

This anomalous result, it can be shown, occurred primarily because of
the inappropriateness of the considered loss function for decision problems
such as this teaching situation. One must look particularly at the preferences
for taking the two terminal actions $a_c$ and $a_{\bar{c}}$ implied by the loss matrix $L_1(4)$.
The backward induction solution is started by pairing plays of the game with
common histories up to the last vertex of the play, where one member of the pair
has the choice $\bar{C}^*$ and the other member has the choice $C^*$ as the value of its
terminal vertex. Let h be the generic symbol to stand for a representative
history along a play of the game up to the last vertex and then consider the
conditional probabilities $p\{(C^*)_4 \mid h\}$ and $p\{(\bar{C}^*)_4 \mid h\}$. The set of all these
conditional probabilities for the representative pair of plays $[h, (C^*)_4]$ and
$[h, (\bar{C}^*)_4]$ is shown in Figure 9. The graph of the set of these representative
pairs of conditional probabilities is shown to be the line connecting the
points (0,1) and (1,0) in the plane.

The best terminal action among the pairs of these longest plays in the
game is determined by computing the expected loss over the distributions
defined on the states $C^*$ and $\bar{C}^*$ for the actions $a_c$ and $a_{\bar{c}}$. The best action
at the level of the outermost vertices of the class of longest plays is then
the action which minimizes this expected loss. Consequently, it can be seen
that the loss matrix $L_1(4)$ implies a partition of the sets of conditional
probabilities $p\{(C^*)_4 \mid h\}$ and $p\{(\bar{C}^*)_4 \mid h\}$ into three subsets:

(1) Region of preference for the action $a_{\bar{c}}$
    --the terminal action $a_{\bar{c}}$ will be best when $p\{(C^*)_4 \mid h\} < .5$

(2) Indifference point
    --the terminal actions $a_c$ and $a_{\bar{c}}$ are equally preferred when
    $p\{(C^*)_4 \mid h\} = .5$

(3) Region of preference for the action $a_c$
    --the terminal action $a_c$ will be best when $p\{(C^*)_4 \mid h\} > .5$

Figure 9

In Example 2, the initial-state distribution and the learning-rate parameters
are such that, for practically all of these pairs of plays of the game,
the conditional probabilities $p\{(C^*)_4 \mid h\}$ and $p\{(\bar{C}^*)_4 \mid h\}$ fall in the region of
preference for the action $a_{\bar{c}}$. Consequently, the best strategy, so to speak,
seeks out the item allocation scheme which will maximize the conditional
probabilities $p\{(\bar{C}^*)_4 \mid h\}$ in the various pairs of longest plays of the game in
order to minimize the overall Bayes risk of the design.


Best designs when losses are expressed differentially in terms of errors

It would appear that the preference regions for the terminal actions
implied by the definition of the loss matrix $L_1(4)$ do not reflect the
preferences for these terminal actions that sound teaching objectives would dictate.
It seems safe to say that most educators would prefer to expand the region of
preference for the terminal action $a_c$; that is, they would prefer to conclude
that a student had learned the required concepts unless the probability that
the student had mastered the materials was quite small.

A simple modification of the loss matrix $L_1(4)$ will be considered in the
next three examples which leads to more appropriate strategies from the
standpoint of teaching objectives. Let the preference partition of the sets of
conditional probabilities $p\{(C^*)_4 \mid h\}$ and $p\{(\bar{C}^*)_4 \mid h\}$ be prescribed in the
following alternative way:

(1') Region of preference for the action $a_{\bar{c}}$
     --the terminal action $a_{\bar{c}}$ shall be best when $p\{(C^*)_4 \mid h\} < .2$

(2') Indifference point
     --the terminal actions $a_c$ and $a_{\bar{c}}$ are equally preferred when
     $p\{(C^*)_4 \mid h\} = .2$

(3') Region of preference for the action $a_c$
     --the terminal action $a_c$ shall be best when $p\{(C^*)_4 \mid h\} > .2$

The choice of the particular alternative indifference point (.2, .8) was made
only to illustrate the general direction of change in the preference partition
which should be followed.

There are, of course, many ways to alter the loss matrix $L_1(4)$ which will
satisfy this alternative preference partition. However, it seems especially
suitable to alter only the second column of the matrix to preserve the common
definition of losses in taking the action $a_c$ over the trial numbers 1, 2, 3, and
4. Thus, to satisfy this preference partition the loss plus cost matrix
$L_2(4)$ will be used, where

                   $L_2(4)$
                    $a_c$      $a_{\bar{c}}$
    $C^*$           .3          1.3
    $\bar{C}^*$    1.3          1.05

The vectors $L_1(1)$, $L_1(2)$, and $L_1(3)$ in the following three examples are not
changed.
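
The link between a 2 x 2 loss-plus-cost matrix and its indifference point can be checked numerically. The sketch below equates the expected losses of the two terminal actions and solves for the probability of $C^*$; the function names are illustrative, and the matrix entries are the values as read from the matrices given in the text.

```python
def indifference_point(m):
    """Probability of C* at which both terminal actions have equal expected loss.

    m = [[loss(C*, a_c),     loss(C*, a_cbar)],
         [loss(not C*, a_c), loss(not C*, a_cbar)]]
    """
    extra_loss_wrong_accept = m[1][0] - m[1][1]  # penalty for a_c when not C*
    extra_loss_wrong_reject = m[0][1] - m[0][0]  # penalty for a_cbar when C*
    return extra_loss_wrong_accept / (extra_loss_wrong_accept + extra_loss_wrong_reject)


def best_action(m, p):
    """Terminal action with the smaller expected loss when P(C*) = p."""
    loss_accept = p * m[0][0] + (1 - p) * m[1][0]
    loss_reject = p * m[0][1] + (1 - p) * m[1][1]
    return "a_c" if loss_accept <= loss_reject else "a_cbar"


L1_4 = [[0.3, 1.3], [1.3, 0.3]]   # symmetric losses
L2_4 = [[0.3, 1.3], [1.3, 1.05]]  # second column altered

p_star_1 = indifference_point(L1_4)  # about .5
p_star_2 = indifference_point(L2_4)  # about .2
```

Under these entries, a posterior probability of .3 for $C^*$ leads to $a_{\bar{c}}$ with the symmetric matrix but to $a_c$ with the altered one, which is the direction of change the text asks for.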

Example 3: Best Design for a Rapid Learners Population

The conditions in this problem are identical to Example 1 with the single
exception that the $L_2(4)$ matrix is used instead of $L_1(4)$. The best design for
this example turned out to be the same design found to be best for Example 1.
Since most of the conditional probabilities $p\{(C^*)_4 \mid h\}$ for the various
histories h exceeded .5 at the end of the third trial, it should not be too
surprising that the best design for Example 3 turns out to be identical to the
best design in Example 1. (The detailed illustration of the solution for the
best single-trial strategy which was shown in Table 2.1 and Table 2.2 used the
parameters, cost function, and loss function of Example 3. One may derive the
values in those tables which were presented as given numbers by performing the
required computations using the input values defined for Example 3.)

Example 4: Best Design for a Slow Learners Population

The conditions in this problem, other than the substitution of the $L_2(4)$
matrix for the $L_1(4)$ matrix, are identical to the conditions of Example 2. The
tree of the best design which resulted for Example 4 is shown in Figure 10.
Comparison of the trees in Figure 8 and Figure 10 will reveal that the
modification of the loss function carried out in the $L_2(4)$ matrix has had a marked
effect on the structure of the best strategy. Foremost among the effects one
observes is that the new best strategy employs item allocations which lead,
with a single exception, to the terminal conclusion that the student has
mastered the concepts, the conclusion $a_c$. Also the general rule is detected
in this strategy that, after starting with a B item, one switches to A items
following a correct response, and, for both A and B items, one stays with the
same type of item if an incorrect response occurs. This procedure intuitively
would seem sound in view of the fact that the guessing probabilities, $g_A$ and
$g_B$, are assumed to have the value 0 in these examples.

Example 5: An Equal Mixture of the Populations $\omega_1$ and $\omega_2$

To further highlight the role of the prior distribution in determining
the structure of the best strategy, the vector of prior probabilities, say,
$\pi'_{13} = [.5,.5]$, was used in this example. The other input values for this
example were identical to the conditions of Examples 3 and 4.

The tree of the best design for this example is shown in Figure 11. The
item allocation configuration which occurs in this tree does not differ
radically from the corresponding configurations of the trees of the best
designs for Examples 3 and 4. Roughly speaking, one could also say that the
terminal actions for Example 5 agree fairly closely with those of Examples 3
and 4. The sampling plan component of these best pure sequential strategies
seems to be the component which has changed the most from example to example.

Incidental  remarks  about  the  best  strategies  in  these  five  examples 

It  is  interesting  to  note  that  throughout  all  five  examples  the  best 
strategies  were  initiated  by  the  administration  of  the  more  difficult  type  B 
item. Surely, this cannot represent a universal rule in these two-concept
problems,  as  one  could  easily  alter  this  result  by  appropriate  choice  of  the 
initial  distribution  on  the  four  states  of  the  conditioning  function.  However, 
in  these  examples,  where  in  each  case  the  initial  probabilities  of  being  in  the 
states  C  =  (CA,  and  C0  =  (^CB)IJ(CA  , C^)  are  equal,  the  general 

rule  that  it  is  best  to  allow  more  trials  for  exposure  to  Concept  B  than  to 
Concept  A  may  obtain. 

Example 5: Tree of the Best Design

For the purpose of providing some basis of expressing the reduction in
the risk that is gained by following the optimal strategies, the Bayes risks
of the best strategies and the risks that would occur if no experimentation
were performed have been listed for each example in Table 3 below:

                            Risks
    Example    No Experimentation    Best Strategy
       1             .50                .232
       2            1.00                .5775
       3             .50                .232
       4            1.00                .86425
       5             .75                .560625

                           Table 3
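
One way to express the reduction in risk is as an absolute and a relative difference between the two columns. The short sketch below does this for the tabled values; the choice of these two summary measures is an illustrative assumption, not something the report prescribes.

```python
# Risks from Table 3: example -> (no experimentation, best strategy)
TABLE_3 = {
    1: (0.50, 0.232),
    2: (1.00, 0.5775),
    3: (0.50, 0.232),
    4: (1.00, 0.86425),
    5: (0.75, 0.560625),
}


def risk_reduction(no_experiment, best):
    """Return (absolute reduction, reduction as a fraction of the baseline)."""
    absolute = no_experiment - best
    return absolute, absolute / no_experiment


reductions = {ex: risk_reduction(*risks) for ex, risks in TABLE_3.items()}
```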
6.  Remarks

Solutions for best designs of teaching experiments by backward induction
down the trees of the extensive form of such games were shown in Part 5.

These solutions reduce the computational effort required to obtain the Bayes
solution or best strategy for these games by an appreciable amount over the
effort demanded for sheer enumeration of the risk of each pure strategy. In
this final part of the report an outline will be given of the proof that the
backward induction technique does yield the strategy which is Bayes against
the prior distribution $\pi$. The role of sufficient statistics for further
reducing the computational effort to solve for Bayes strategies will then be
examined briefly. Finally, some methods for determining "reasonably good"
strategies will be discussed.

Solution  by  Backwards  Induction  Yields  Bayes  Strategy 

A formal proof that the backward induction solution technique yields a
pure strategy (e,b,d) which is Bayes against the given prior distribution $\pi$
can be carried out by straightforward generalization of well-known proofs.
These are proofs that the backward induction technique yields the Bayes
strategy (b,d) in the sequential game involving a fixed experiment e (see
Blackwell and Girshick [4, Chapter 9] for such a proof). The extension of
that proof, which shows that the sampling plan b and terminal decision
function d obtained by backward induction are Bayes against $\pi$ for a fixed
experiment, to the sequential design problem is outlined below.

It should be noted at the outset that the sequential teaching games under
consideration here involve finite sets of terminal actions; consequently, at
each folding-back step in the backward induction process it is possible to
select the terminal action $a \in A$ which actually achieves the minimum expected
terminal loss. It is the finiteness of the set of pure strategies for nature
and of the set of terminal actions which results in a Bayes solution that is
a pure strategy.
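
The folding-back step can be sketched on a generic finite tree. In the Python below the node encoding (`"terminal"`, `"chance"`, `"decision"`) and the one-trial example tree are hypothetical, chosen only to show averaging out at nature's moves and minimization at the experimenter's moves; the numbers are not taken from the examples of Part 5.

```python
def fold_back(node):
    """Return (risk, plan) for a node of a finite game tree.

    A node is one of:
      ("terminal", loss)
      ("chance", [(probability, child), ...])  -- nature moves: average out
      ("decision", {label: child, ...})        -- experimenter moves: minimize
    """
    kind = node[0]
    if kind == "terminal":
        return node[1], None
    if kind == "chance":
        folded = [(p, fold_back(child)) for p, child in node[1]]
        risk = sum(p * r for p, (r, _) in folded)
        return risk, [plan for _, (_, plan) in folded]
    if kind == "decision":
        folded = {label: fold_back(child) for label, child in node[1].items()}
        best = min(folded, key=lambda label: folded[label][0])
        return folded[best][0], (best, folded[best][1])
    raise ValueError("unknown node kind: %r" % (kind,))


def terminal_choice(p_mastered, cost):
    """Choice between two terminal actions given P(C*) and the accrued cost."""
    return ("decision", {
        "a_c": ("chance", [(p_mastered, ("terminal", cost)),
                           (1 - p_mastered, ("terminal", 1 + cost))]),
        "a_cbar": ("chance", [(p_mastered, ("terminal", 1 + cost)),
                              (1 - p_mastered, ("terminal", cost))]),
    })


# Hypothetical one-trial game: stop at once, or pay .1 for one observation.
tree = ("decision", {
    "stop": terminal_choice(0.5, 0.0),
    "test": ("chance", [(0.6, terminal_choice(5 / 6, 0.1)),   # correct response
                        (0.4, terminal_choice(0.0, 0.1))]),   # incorrect response
})

risk, plan = fold_back(tree)
```

Because every decision node has finitely many labels, `min` always attains the minimum, so the plan returned is a pure strategy.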

Since there are only a finite number of pure experiments e in the set of
all possible pure strategies for these truncated teaching games, one can in
principle find the Bayes solution by backward induction for each fixed
experiment $e \in E$ and then select, as the Bayes solution for the sequential
design problem, the strategy (e,b,d) which has minimum risk against $\pi$ over the
set of all Bayes solutions for each fixed experiment. That is, let $e_1, e_2, \ldots, e_m$
be the set of all pure experiments possible in these truncated sequential
games, m in number, and for a fixed experiment let (b*,d*) be the sampling
plan and terminal decision function which minimizes the risk

n 


n,  (b,d) 


.  y 


t=l 


v 

reb 


\ 

/_> 

ayeS 


ct(r) 


Thus let the set $\{(e_1,b^*,d^*), (e_2,b^*,d^*), \ldots, (e_m,b^*,d^*)\}$ represent the set of
Bayes solutions for each of the sequential games involving fixed pure
experiments $e_1, e_2, \ldots, e_m$. Further, let (e*,b*,d*) be that strategy in this set of
strategies which has minimum risk among the members of the set. It is clear
that (e*,b*,d*) is the Bayes solution to the sequential design problem, i.e.,
this strategy minimizes the risk against $\pi$ over all strategies (e,b,d).

The extension of this proof from fixed-experiment sequential games to the
sequential design situation could perhaps be made clearer
had we represented the extensive form of the sequential design game by a tree
such as the one illustrated in Figure 12. In that figure, a tree of a 2-trial
truncated teaching game is graphed. For the two-concept model there are
8 pure experiments possible and each of these includes the possibility of
no experimentation. In the tree of Figure 12, the 8 possible pure
experiments have been segregated and the experimenter's initial choice set at
Trial 1 requires a choice of one of the 8 pure experiments or the choice of
the dummy experiment which involves no experimentation. The backward
induction solution process can thus be carried out separately for each of
the 8 pure experiments, and finally at the level of Trial 1, the choice of
the best sequential design is made by seeking among the best strategies in
each of the 9 experiments for the strategy with minimum risk.

The trees of the type shown in Figure 5 seemed a more efficient
representation of these sequential design problems for computational purposes. The
equivalence of the games represented by the two types of trees shown in Figures 5
and 12 is easily shown by reduction of each game to normal form.
Solutions for Best Designs in Terms of Posterior Probabilities

The best designs of teaching experiments which were obtained in Part 5
tell the experimenter what he should do for each sample sequence that may occur
in the sample space of the best experiment. In certain problems, it is possible
to obtain closed-form solutions for best sequential designs by considering the
space of posterior distributions of the parameters, given the various subsequences
of sample values. For example, when there are only two states of nature and
two terminal actions, then the Bayes strategy is equivalent to the sequential
probability ratio procedure. Particularly when the components of the sample
sequences are independently distributed, the sequential probability ratio test
can be readily determined by finding two critical values which partition the
unit interval into three regions. These critical values can frequently be solved
for by closed-form procedures.
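
A rule of this kind can be sketched for independent Bernoulli observations and two states of nature. In the Python below the response probabilities `q1` and `q2` and the critical values `lower` and `upper` are hypothetical; in an actual Bayes solution the two critical values would be derived from the losses and costs rather than chosen by hand.

```python
def posterior_update(p, x, q1, q2):
    """Posterior probability of state 1 after one 0/1 observation x.

    q1, q2 -- P(x = 1 | state 1) and P(x = 1 | state 2)
    """
    like1 = q1 if x else 1 - q1
    like2 = q2 if x else 1 - q2
    return p * like1 / (p * like1 + (1 - p) * like2)


def sequential_decision(prior, observations, q1, q2, lower, upper):
    """Stop as soon as the posterior leaves the interval (lower, upper)."""
    p = prior
    n = 0
    for n, x in enumerate(observations, start=1):
        p = posterior_update(p, x, q1, q2)
        if p >= upper:
            return "accept state 1", n, p
        if p <= lower:
            return "accept state 2", n, p
    return "continue", n, p


# Hypothetical values: three correct responses in a row.
decision, n_used, p_final = sequential_decision(
    prior=0.5, observations=[1, 1, 1], q1=0.8, q2=0.4, lower=0.1, upper=0.85
)
```

The two critical values split the unit interval into an "accept state 2" region, a continuation region, and an "accept state 1" region, which is the three-region partition described above.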

Other sequential games with finite and equal numbers of states of nature
and terminal actions plus special types of loss functions can be solved by
fairly straightforward methods. In the examples considered in this paper,
there did not appear to be any special advantages to seeking best designs
in terms of the space of posterior distributions rather than in the sample
space of the observable outcomes, because of the loss functions employed and
because of the numbers of states of nature distinguished (n+2 states). We are
currently examining some special cases of sequential strategies in teaching
programs, for which it appears that closed-form solutions may be obtainable in
terms of decision regions in the space of posterior distributions on the
parameters. Results of these studies will be published in subsequent reports.
Role of Sufficient Statistics

In many statistical games it is possible to partition the sample space
into sets of outcomes such that all the elements in a given cell of the
partition are, in a certain sense, equally informative about the parameters or
states of nature. Such partitions of the sample space are called sufficient
partitions of the sample space. A standard definition of a sufficient
partition, say S, of a sample space X is to call a partition S of X sufficient if
for every subset A of X and for every cell or subset s of the partition S for
which p(s) > 0, the conditional probability $p(A \mid \omega, s)$ is independent of the
parameter points $\omega \in \Omega$. An alternative, equivalent definition stated in the
language of Bayesian theory is that a partition S of the sample space X is
sufficient if the posterior distribution on $\Omega$, given that an outcome x is an
element of a cell s of S, is the same as the posterior distribution, given the
actual value of the outcome x, for every prior distribution $\pi$ on $\Omega$ and for
every outcome x such that $\sum_{\omega \in \Omega} p(x \mid \omega)\,\pi(\omega) > 0$.

A function or random variable T whose domain is a sufficient partition S
of a sample space X is called a sufficient statistic. Since the elements of
many sets can be put in one-to-one correspondence with the cells s of S, a
sufficient partition determines many sufficient statistics. The range of a
sufficient statistic T is a new outcome space generated by T. In many
situations, the range space of T has a considerably smaller number of elements
than the original outcome space X. When this kind of reduction of the
number of relevant outcomes to be considered in the representation of the
tree of the extensive form of a sequential design game is possible through the
identification of sufficient statistics, the computational effort to carry out
the backward induction solution will be correspondingly reduced.
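
The Bayesian form of the definition can be illustrated with a case simpler than the learning models of this paper: independent 0/1 trials with a constant success probability, an assumption made here only for the sake of the example. There the number of successes is a sufficient statistic, so the posterior given the full sequence equals the posterior given the count, and 2^n sequences collapse to n + 1 cells.

```python
def posterior_given_sequence(prior, thetas, xs):
    """Posterior over the parameter points given the full 0/1 sequence xs."""
    weights = []
    for w, theta in zip(prior, thetas):
        likelihood = 1.0
        for x in xs:
            likelihood *= theta if x else 1 - theta
        weights.append(w * likelihood)
    total = sum(weights)
    return [v / total for v in weights]


def posterior_given_count(prior, thetas, n, s):
    """Posterior given only the number of successes s in n trials."""
    weights = [w * theta ** s * (1 - theta) ** (n - s)
               for w, theta in zip(prior, thetas)]
    total = sum(weights)
    return [v / total for v in weights]


prior = [0.5, 0.5]    # hypothetical prior on two parameter points
thetas = [0.3, 0.7]   # hypothetical success probabilities
sequences = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
cells = sorted({sum(xs) for xs in sequences})  # the 4 cells of the partition
```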

It is evident that even for models of teaching processes of the simple
type that have been considered in this paper, the computation required
by the backward induction process will rapidly become excessive as the
truncation trial number n increases. For some models of teaching processes which
admit of coarse sufficient partitions of the sample spaces, it may be possible
to extend the truncation trial number to a level approaching common practice
in certain types of teaching experiments. Many models of learning or teaching
processes take infinite-dimensional sets for their outcome spaces. Both the
backward induction solution concept and the concept of reduction of the sample
space to the range space of a sufficient statistic must be modified in the
case of infinite-dimensional outcome sequences.

Raiffa and Schlaifer [18] have very comprehensively examined a large
number of statistical decision problems involving some familiar probability
distributions by representing these problems as games in extensive form. They
do not consider problems involving time-dependent processes, such as the
learning process being considered here, but they do make several observations about
solving statistical games in extensive form that are pertinent also to
these stochastic process problems. These authors emphasize two features of
a statistical game which markedly simplify the computations required to
determine best strategies by the backward induction solution technique:

(1) The existence of sufficient statistics of fixed, small dimensionality
    (essentially this condition on a sufficient statistic T means that
    its range space should have a small number of elements).

(2) The existence of prior distributions on $\Omega$ which have a property
    that they call "being a natural-conjugate" of the conditional
    distributions of the sufficient statistics given an experiment e and parameter
    $\omega$.

Their second condition results in simplification of the computation of the
expected losses associated with each individual play of the game or each value
of the sufficient statistic. Raiffa and Schlaifer refer to this phase of the
solution as terminal analysis. It would appear that the probability
distributions on the outcome sequences, or sufficient contractions thereof, of many
current learning models are not apt to admit of conjugate prior distributions,
at least of reasonably simple forms.
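
For contrast, the simplest familiar conjugate pair is a Beta prior with Bernoulli observations, where the posterior stays in the Beta family and terminal analysis reduces to updating two numbers. The sketch below is a generic textbook illustration and is not a model used in this paper.

```python
def beta_update(a, b, successes, failures):
    """Posterior Beta parameters after Bernoulli data under a Beta(a, b) prior."""
    return a + successes, b + failures


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Uniform prior Beta(1, 1); the sufficient statistic is (3 successes, 1 failure).
a_post, b_post = beta_update(1, 1, successes=3, failures=1)
posterior_mean = beta_mean(a_post, b_post)
```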

Relating  Structures  of  Best  Designs  to  Characteristics  of  the  Parameter  Space 

The backward induction technique is a very generally applicable method of
solving for best designs in teaching experiments; however, without some
further analysis of the relationship of the structures of best designs to
characteristics of the parameter space of these experiments one would have
little basis for predicting general directions of change in the designs as
changes are made in the parameter space. For example, some general theory
should be developed concerning the relationship of changes in the
item-allocation portion of the structure of a strategy as one changes the
values of the initial distribution on the states of the conditioning function,
or the values of the various learning-rate parameters, or the values of the
probabilities of guessing correct answers.

In the illustrations of designs of teaching experiments given in Part 5
the parameter space Ω was restricted to contain only two elements. Thus in
these illustrations the design problem was viewed in part as a simple
classification problem (classifying each individual student as a member of one
of two populations) along with other objectives concerned with deciding when a
student had mastered the concepts in the teaching program. For many practical
applications of these design techniques it may be adequate to define the
parameter space Ω to consist of a small, finite number of "well-separated"
points. In such applications, one would anticipate that an experimenter
could assign prior probabilities directly to each element of Ω for every
student who is brought into the automated teaching situation.

It may frequently turn out, on the other hand, that the representation of
Ω as a small, finite set is neither practically adequate nor analytically
convenient. One may find, for example, that it is much more appropriate to
expand Ω so that the experimenter can specify his prior distributions by
simply specifying a few parameters of distributions over Ω. Raiffa and
Schlaifer [18] discuss this mode of assignment of prior distributions at
length and specify a number of desiderata that prior distributions should
satisfy in order to make the solutions for best strategies tractable.

Relating Structures of Best Designs to Form of Loss and Cost Functions

It is unfortunate that the structure of best strategies in statistical
games often varies quite sensitively with changes in the specification
of the loss and cost functions. Again, it would be desirable to examine
these learning or teaching experiment situations to supplement solution
techniques with some general analysis of the relationship of directions
of changes in the structures of best designs to changes in the loss
and cost functions.
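
One elementary instance of this sensitivity, offered only as an illustration
and not as a result of this report, arises in a terminal decision with two
states and two actions. The symbols are assumptions introduced here: p is the
posterior probability of ω_1, ℓ_12 is the loss from taking action a_1 when
ω_2 holds, ℓ_21 is the loss from taking a_2 when ω_1 holds, and correct
decisions are taken to cost nothing. Comparing expected losses gives

```latex
\[
a_1 \text{ is preferred to } a_2
\iff (1-p)\,\ell_{12} < p\,\ell_{21}
\iff p > \frac{\ell_{12}}{\ell_{12}+\ell_{21}} .
\]
```

Under these assumptions the cutoff on p depends only on the ratio of the two
losses, so a change in that ratio can alter the terminal decision at every
vertex of the tree at once.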

Approximations to Solutions When the Numbers of Trials Are Large

If one is faced with solving for a best design in a problem involving a
large number of trials where it is not possible to reduce the many branches of
the tree to a more feasible number, one might hope to get a good approximation
to the best design by considering solutions for best strategies for a number
of subproblems consisting of "looking forward" as many trials as computational
time and costs will allow.

For example, in problems like those considered in Part 5, if costs of
experimentation allowed for continuation through, say, 30 trials but the
computation of best strategies could only be afforded for the 3-trial
subproblems, one could put together an approximation to the best strategy with
(at most) 10 pieces or substrategies. In such situations, one would terminate
experimentation at any point where the terminal decision a was made that the
subject had mastered the concepts, but a continuation for another three
trials would be made at any terminal vertex where the contrary decision was
made.
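
The piecing-together just described can be sketched in code. The sketch below
is a deliberately simplified stand-in, not the learning model of Part 5: it
assumes two states ("mastered" and "not mastered"), responses that are
independent given the state, a single item type, and made-up costs and losses.
All names and constants are hypothetical. Each piece is solved by backward
induction over at most three trials, and the posterior at the end of one piece
serves as the prior for the next.

```python
# Hypothetical two-state model; every constant below is an illustrative assumption.
P_CORRECT = (0.9, 0.5)       # P(correct | mastered), P(correct | not mastered)
TRIAL_COST = 0.02            # cost of administering one more trial
LOSS_FALSE_MASTERY = 1.0     # loss for declaring mastery when it is absent
LOSS_MISSED_MASTERY = 0.5    # loss for declaring non-mastery when it is present

def terminal(p):
    """Best terminal decision and its expected loss, with p = P(mastered)."""
    loss_if_master = (1 - p) * LOSS_FALSE_MASTERY
    loss_if_not = p * LOSS_MISSED_MASTERY
    if loss_if_master <= loss_if_not:
        return "mastered", loss_if_master
    return "not mastered", loss_if_not

def update(p, correct):
    """Posterior P(mastered) after one observed response (Bayes' rule)."""
    like_m = P_CORRECT[0] if correct else 1 - P_CORRECT[0]
    like_n = P_CORRECT[1] if correct else 1 - P_CORRECT[1]
    return p * like_m / (p * like_m + (1 - p) * like_n)

def value(p, n):
    """Backward induction: (minimal expected loss, continue?) with n trials left."""
    stop_loss = terminal(p)[1]
    if n == 0:
        return stop_loss, False
    p_correct = p * P_CORRECT[0] + (1 - p) * P_CORRECT[1]
    go_loss = (TRIAL_COST
               + p_correct * value(update(p, True), n - 1)[0]
               + (1 - p_correct) * value(update(p, False), n - 1)[0])
    if go_loss < stop_loss:
        return go_loss, True
    return stop_loss, False

def piecewise(p, responses, horizon=3, max_trials=30):
    """Chain horizon-trial best strategies; returns (decision, trials used, posterior)."""
    responses = iter(responses)
    used = 0
    for _ in range(max_trials // horizon):
        start, n = used, horizon
        while n > 0 and value(p, n)[1]:
            p = update(p, next(responses))
            used += 1
            n -= 1
        # Stop on a mastery decision, or when a piece calls for no trials at all
        # (later pieces would then start from the same posterior and do the same).
        if terminal(p)[0] == "mastered" or used == start:
            break
    return terminal(p)[0], used, p

print(piecewise(0.5, [True] * 30))
print(piecewise(0.5, [False] * 30))
```

The strategy assembled this way is only an approximation: each piece is best
for its own three-trial horizon but ignores the value of later pieces, so it
may stop sooner than the best 30-trial strategy would.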

In Example 5 of Part 5, it is found that only a single terminal vertex
would lead to such a continuation. The sequence associated with that vertex
is $[(B,R_B)_1, (B,R_B)_2, (B,R_B)_3]$. In this circumstance, one could
determine a continuation that would be the best strategy for the next three
trials by computing the posterior distribution on Ω given this outcome
sequence; e.g.,

\[
P\{\omega_1 \mid [(B,R_B)_1,(B,R_B)_2,(B,R_B)_3]\}
  = \frac{\pi_1^*\, P\{[(B,R_B)_1,(B,R_B)_2,(B,R_B)_3] \mid \omega_1\}}
         {\pi_1^*\, P\{[(B,R_B)_1,(B,R_B)_2,(B,R_B)_3] \mid \omega_1\}
          + \pi_2^*\, P\{[(B,R_B)_1,(B,R_B)_2,(B,R_B)_3] \mid \omega_2\}} .
\]

In Example 5 the prior distribution was assumed to be π* = (.5, .5); thus when
these and the other known values are substituted into the right-hand side of
this expression, one obtains

\[
P\{\omega_1 \mid [(B,R_B)_1,(B,R_B)_2,(B,R_B)_3]\}
  = \frac{.5(.565)}{.5(.565) + .5(.010)} = .983;
\]

consequently, $P\{\omega_2 \mid [(B,R_B)_1,(B,R_B)_2,(B,R_B)_3]\} = .017$.
Using this posterior distribution of the parameters, one could then seek the
best strategy for the next three trials by the techniques of Part 5.
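
The arithmetic of this posterior calculation can be checked with a short
sketch. It assumes that the likelihoods of the outcome sequence under the two
parameter points are .565 and .010, as read from the worked example above. The
function name is hypothetical.

```python
def bayes_posterior(prior, likelihoods):
    """Posterior over a finite parameter space via Bayes' rule."""
    joint = [p * like for p, like in zip(prior, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]

prior = [0.5, 0.5]             # prior on the two parameter points, as in Example 5
likelihoods = [0.565, 0.010]   # assumed likelihoods of the observed sequence
post = bayes_posterior(prior, likelihoods)
print([round(x, 3) for x in post])
```

The same function would serve for any small, finite parameter space, since
only the lengths of the two lists change.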

In some circumstances, one might also reduce the computations for
finding best strategies in truncated teaching experiments involving a
large number of trials by redefining a trial to include a block of items.

The success of such reductions of the computational difficulties would of
course hinge largely on what effects such aggregations had on the relative
complexity of the probability distributions of the sequences of item blocks.
Other reductions of the complexity of these problems suggest themselves too;
prominently, the various objectives which one may try to achieve can be
modified in a number of ways to simplify the task of determining best terminal
actions.